Double quotes should be escaped in data.gov catalog csv

From Data-gov Wiki

Infobox (Issue Report)
  • name: Double quotes should be escaped in data.gov catalog csv

  • description: found wrong "Data.gov Data Category Type" in dataset 92
  • creator(s): Li Ding
  • created: 2009-08-18
  • modified: 2010-05-16


Symptom

Field values were placed in the wrong columns in http://www.data.gov/data_gov_catalog.csv, e.g. in Dataset 92 (Data.gov Catalog - Complete, Executive Office of the President). This has been observed in over 20 rows. To see the problem, download the CSV file and open it in Microsoft Excel.

  • Check the "Data.gov Data Category Type" column.
Diagnosis
  • Text fields are wrapped in double quotes, so any double quote inside a field value must be escaped by writing it as two double quotes. The following row shows correct usage:
11982,"note: ""this is, a test"" for double quote","<a href=""http://foo.com"">http://foo.com</a>",123

This CSV row carries the following four fields:

11982
note: "this is, a test" for double quote
<a href="http://foo.com">http://foo.com</a>
123
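
As a quick check of the example row, the following minimal sketch parses it with Python's standard csv module, which follows the doubled-quote convention by default. The variable names are illustrative only, and the row is the sample from this report rather than a line from the actual catalog file.

```python
import csv
import io

# The sample row from this report, exactly as it would appear in the file.
line = (
    '11982,"note: ""this is, a test"" for double quote",'
    '"<a href=""http://foo.com"">http://foo.com</a>",123\n'
)

# csv.reader treats "" inside a quoted field as one literal double quote.
fields = next(csv.reader(io.StringIO(line)))

for field in fields:
    print(field)
```

The embedded comma and the double quotes stay inside their fields, so the row splits into four values rather than spilling into extra columns.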

Double quotes were typically found in fields such as "Description" and "Citation", which contain either text descriptions or HTTP links.

How to Fix
  • Before writing to CSV, escape each text field by replacing " (one double quote) with "" (two double quotes).
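
The fix can be sketched in Python as follows. This is only an illustration under stated assumptions: the tool that generates the data.gov catalog is not known, the helper name escape_csv_field is hypothetical, and the record is the sample row from this report. The sketch shows the manual replacement next to the standard csv module's writer, which applies the same escaping on its own.

```python
import csv
import io


def escape_csv_field(value: str) -> str:
    """Wrap a text field in double quotes, doubling any embedded double quotes."""
    return '"' + value.replace('"', '""') + '"'


# Hypothetical record: the sample values from this report.
row = [
    "11982",
    'note: "this is, a test" for double quote',
    '<a href="http://foo.com">http://foo.com</a>',
    "123",
]

# Manual approach: escape the two text fields, leave the numeric fields bare.
manual = ",".join(
    [row[0], escape_csv_field(row[1]), escape_csv_field(row[2]), row[3]]
)

# Library approach: csv.writer quotes and escapes fields only where needed.
buffer = io.StringIO()
writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
writer.writerow(row)
library = buffer.getvalue().rstrip("\n")

print(manual)
print(library)
```

Where a CSV library is available, letting it handle quoting is usually less error-prone than escaping by hand, since it also covers embedded commas and line breaks.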