CSV Data Source Unrecognised Character

(Nick Papadatos) #1

Matrix Version: 5.5

Hi Squiz folk

I'm seeing something weird: an invisible space or unrecognised character shows up in my data, as seen in the image (red background, white dot).

If I delete the row with the issue, the character just moves to the next row, and so on.
Can anyone tell me what this is and how to fix it?

Setup: Matrix CSV data source, with the .xlsx saved as CSV UTF-8 (comma-delimited).


(Hugh McMaster) #2

Excel adds a Byte Order Mark (BOM) when saving files with UTF-8 encoding.

Squiz doesn’t skip the BOM when parsing a CSV file, so you need to manually remove the BOM before uploading.

Programs such as Notepad++ can do this easily. There’s a menu option to change file encoding from UTF-8 (BOM) to UTF-8.

I reported this issue to Squiz over a year ago. Unfortunately, it got little response from the dev team.

(Nick Papadatos) #3

Hmm, thank you Hugh.
I mainly work on a Mac but will get out the Windows laptop (that has Notepad++).

PS - come on Squiz dev, get a fix pls.


(Nick Papadatos) #4

Another workaround, for anyone who's interested, is to do the following:

  1. Create a Paint Layout:
  • Default Format:

      <script runat="server">
        // Rows from the data source's cached_content attribute (falls back to []).
        var csvData = %asset_data_attributes^index:cached_content^index:value^empty:[]%;
        print('"' + csvData.map(function(row) {
          return row.map(function(cell) {
            // Strip zero-width characters and the BOM, double any quotes, trim.
            return cell.replace(/[\u200B-\u200D\uFEFF]/g, '').replace(/"/g, '""').trim();
          }).join('","').replace(/\s+/g, ' ');
        }).join('"\n"').replace(/\uFEFF/, '') + '"');
      </script>

Once that is done, leave Page Contents as is.

  2. On your CSV data source, apply that Paint Layout.
  3. Staying on your CSV data source, set the applied or Override Design to one that uses a CSV parse file.
  4. Preview your CSV data source and it should download a clean .CSV version.
  5. Re-upload that clean version to the data source. It should now be OK.
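
To sanity-check the cleaning logic outside Matrix, something like the following can be run in Node.js. It is a rough stand-alone sketch that mirrors the Paint Layout script, not the script itself: `cleanCell`, `toCleanCsv` and the sample rows are made-up names and data, and cells are coerced to strings, which the original does not do.

```javascript
// Clean one cell: drop zero-width characters and the BOM, double any quotes, trim.
// String(cell) is an addition here so that non-string cells do not throw.
function cleanCell(cell) {
  return String(cell)
    .replace(/[\u200B-\u200D\uFEFF]/g, '')
    .replace(/"/g, '""')
    .trim();
}

// Build a fully quoted CSV string from an array of rows (arrays of cells).
// Runs of whitespace, including line breaks inside a cell, collapse to one space.
function toCleanCsv(rows) {
  return rows
    .map(function (row) {
      return '"' + row.map(cleanCell).join('","').replace(/\s+/g, ' ') + '"';
    })
    .join('\n');
}

// Hypothetical sample: the first header cell carries a leading BOM.
const sample = [['\uFEFFName', 'Quote'], ['Ann', 'said "hi"']];
console.log(toCleanCsv(sample));
```

If the first header cell comes out without the stray character, the same rows should come out clean through the Paint Layout too.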