Wednesday 23 January 2013

Argh! BOM, UTF-8, and solution

Potentially useful rant: If you ever have a problem importing and analyzing what you believe is a standard .csv file (for example, using Python and pandas' read_csv), you may want to know that sometimes the .csv file starts with a hidden marker: a byte order mark (BOM), a few invisible bytes that signal the encoding used, such as UTF-8. After wasting too much time discovering and dealing with this, I found a quick solution: Open the .csv file in Notepad++, go to Encoding and select "Encode in UTF-8 without BOM." Save the file again and the problem is gone.
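If you would rather handle this from Python than from an editor, here is a minimal sketch using only the standard library. The function names (strip_utf8_bom, read_csv_rows, make_demo_csv) are made up for illustration and are not part of any library; the sketch also assumes the file really is UTF-8 and small enough to read into memory.

```python
import codecs
import csv
import os
import tempfile


def strip_utf8_bom(path):
    """Rewrite the file without a leading UTF-8 BOM. Return True if one was removed."""
    with open(path, "rb") as f:
        data = f.read()
    if not data.startswith(codecs.BOM_UTF8):
        return False  # no BOM, leave the file untouched
    with open(path, "wb") as f:
        f.write(data[len(codecs.BOM_UTF8):])
    return True


def read_csv_rows(path):
    """Read all rows; the 'utf-8-sig' codec skips a BOM if one is present."""
    with open(path, newline="", encoding="utf-8-sig") as f:
        return list(csv.reader(f))


def make_demo_csv():
    """Create a small temporary CSV that starts with a BOM (hypothetical demo data)."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    with os.fdopen(fd, "wb") as f:
        f.write(codecs.BOM_UTF8 + b"name,value\r\na,1\r\n")
    return path


if __name__ == "__main__":
    demo = make_demo_csv()
    print(read_csv_rows(demo))   # rows read with the BOM skipped
    print(strip_utf8_bom(demo))  # whether a BOM was found and removed
    os.remove(demo)
```

The same "utf-8-sig" codec name can be passed to pandas through the encoding argument of read_csv, which should avoid having to re-save the file at all.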
