Office 365/Excel generates incorrect CSV UTF-8 format files

Hi, I am having problems saving files using the "CSV UTF-8 (comma delimited)" format which is the first of 4 CSV formats in my "Save as type" drop down list.

Consider the Excel file:

ZIP FirstName LastName
90210 Jonathan "jon" Doe

When I save this as a "CSV UTF-8" type I get:

ZIP,FirstName,LastName
90210,"Jonathan "jon"""onat",Doe

The handling of the " characters is wrong, I am guessing that this is some LTR/RTL thing but I asked for 8 bit characters, not RTL.

When I save as a "CSV (comma delimited)"  I get the correct result, which aligns to the ISO standard (who knew that there was one!!):

ZIP,FirstName,LastName
90210,"Jonathan ""jon""",Doe

I have not played with the other 2 CSV types; "CSV (Windows)" or "CSV (Macintosh)" so cannot comment on them.

So to the question; is this a bug or desired behaviour?  If the latter can you explain please because even Excel will not read it back in correctly.

Some versions:

  • OS = Windows 10
  • Office = Office 365, version 1610 Build 7466.2038
  • Locale is UK English so decimal = "." and thousand separator = ",".  I don't have any other locales in use, I even removed the US English one when I set the machine up.

Thanks for your help.

Ps - I have started to use the 2nd format so I am ok for the time being but I have spent about 8 hours trying to work out why this data will not load into my database correctly, this was on row 78,000 in my data, not row 1.  I had assumed that Excel was perfect (as it usually is) so had not looked here.

Answer
Answer

Hi Mat,

To assist you with your concern, we suggest you to follow the steps provided below on how to save a .csv UFT-8 file format:

  1. Open the "*.xlsc" file on Microsoft Excel.
  2. Click Menu > Save As.
  3. Enter the desired name for your file.
  4. On " Save as type," select Unicode Text.
  5. Click Save.
  6. Open your saved file in Microsoft Notepad.
  7. Replace all tab characters with commas (",").
    • Select a tab character (select and copy the space between two column headers).
    • Open the "Find and Replace" window (Press Ctrl+H) and replace all tab characters with comma.
  8. Click Save As.
  9. Name the file, and change the Encoding: to UTF-8.
  10. Change the file extension from "*.txt" to "*.csv".
  11. Click Save.
  12. Open the .csv file in Excel to view your data.

If you encounter problems following the above steps, don't save your file in Excel because it will cause encoding issues. Hence, follow the troubleshooting steps below:

  1. Find the file.
  2. Right click the file > select Open with > Notepad.
  3. Select File > Save As.
  4. Navigate to the folder where you want to save your file.
  5. Enter a name for your file and append .csv to the end of the file name.
  6. Select UTF-8 encoding.
  7. Click Save.
  8. Open the newly created file in Excel to view your data.

Update us on the result.

Thank you.

6 people found this reply helpful

·

Was this reply helpful?

Sorry this didn't help.

Great! Thanks for your feedback.

How satisfied are you with this reply?

Thanks for your feedback, it helps us improve the site.

How satisfied are you with this reply?

Thanks for your feedback.

 
 

Question Info


Last updated November 29, 2023 Views 10,786 Applies to: