skraach Posted May 8, 2014
I have a large (20,000 entries) database in .sql format that I want to edit. I imported it into Excel and the English content displays correctly, but the Thai looks like this: &#3648;&#3629;&#3585;&#3626;&#3634;&#3619;&#3609;&#3637;&#3657;&#3626;&#3636;&#3657;&#3609;&#3626;&#3640;&#3604; Can Excel convert this to display the Thai characters? Should I have specified the encoding during the import process? What is the name of this encoding?
craigt3365 Posted May 8, 2014
I think this is probably better off over in the techie forum. Topic moved.
Richard W Posted May 8, 2014
What you have here is a load of numeric character entities (as in HTML) - thus &#3648; = เ = sara e. If they are in the original file, and the original file is indeed a text file, my first approach without a specialist program would be to tack on a '.htm' extension, open the file in a browser, select the whole page, and save it in a text file. This approach will probably not work if the data is separated by tabs; sequences of white space will be reduced to a single character.
The text you have quoted translates as 'this document terminates', being letter by letter SARA E, O ANG, KO KAI, SO SUA, SARA AA, RO RUA, NO NU, SARA II, MAI THO, SO SUA, SARA I, MAI THO, NO NU, SO SUA, SARA U, DO DEK.
What character encoding did Excel think this document was in? I would guess that it was in ASCII, but it might be UTF-16 (generally = 'Unicode' to Microsoft applications).
Another approach is to save the file in CSV format and write your own program to convert the numbers to characters - or try the above trick of using a browser on the saved CSV file.
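The "write your own program" route suggested above could look something like the following. This is only a minimal sketch in Python, not anything a poster in this thread actually used: the function names and the file paths passed to `convert_file` are hypothetical, and it assumes the dump is a text file readable as UTF-8 (plain ASCII qualifies). It decodes only numeric references (decimal `&#3648;` and hexadecimal `&#xE40;`) and leaves everything else, including tabs and named entities such as `&amp;`, untouched.

```python
import re

# Matches decimal (&#3648;) and hexadecimal (&#xE40;) numeric character references.
NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def decode_numeric_refs(text: str) -> str:
    """Replace numeric character references with the characters they stand for."""

    def to_char(match: re.Match) -> str:
        hex_digits, dec_digits = match.groups()
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        try:
            return chr(code_point)
        except (ValueError, OverflowError):
            # Not a valid Unicode code point: leave the reference as it was.
            return match.group(0)

    return NUMERIC_REF.sub(to_char, text)


def convert_file(src_path: str, dst_path: str) -> None:
    """Decode a whole file line by line (paths are hypothetical placeholders)."""
    # newline="" keeps the original line endings exactly as they are.
    with open(src_path, encoding="utf-8", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(decode_numeric_refs(line))


if __name__ == "__main__":
    sample = "42\t&#3648;&#3629;&#3585;&#3626;&#3634;&#3619;\tsome English text"
    print(decode_numeric_refs(sample))
```

Restricting the pattern to numeric references is a deliberate choice here: a general HTML unescape would also turn `&amp;` or `&lt;` into `&` and `<`, which may or may not be wanted in an SQL dump.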
AyG Posted May 9, 2014
If you Google for "convert html character entities to utf-8" you'll find a number of on-line and off-line converters. Using one of these (link below), the OP's text converts to เอกสารนี้สิ้นสุด. Such a converter should preserve tabs and white space.
http://unicode.online-toolz.com/tools/unicode-html-entities-convertor.php
AyG Posted May 9, 2014
Thinking a bit beyond the original problem, I'm guessing that the database is used to provide data for a web-based application. Can't think of any other reason why the data should be in its current format.
That the data are stored in the database as HTML character entities leads me to think that the specified character set of the web pages where the data are displayed is not UTF-8. If you convert the character entities to Unicode and then import the edited data back into the database, the web application will not display the data correctly.
Probably the sensible thing to do would be to change everything to work with Unicode: data, database, web pages. That may not be a trivial task, depending upon the application. The alternative would be to convert the converted data back to character entities.
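For the last alternative (turning the edited Unicode text back into character entities before re-importing), a rough Python sketch follows. It relies on the standard `xmlcharrefreplace` error handler, which writes each non-ASCII character as a decimal reference; the function name is made up for illustration, and whether the web application really expects decimal references everywhere is an assumption that would need checking against the original data.

```python
def encode_non_ascii_as_refs(text: str) -> str:
    """Replace every non-ASCII character with a decimal numeric character reference."""
    # ASCII characters (including tabs and newlines) pass through unchanged.
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    edited = "42\t\u0e01\u0e02\tsome English text"
    print(encode_non_ascii_as_refs(edited))
```

Note that this does not escape characters such as `&` or `<`, so text that was already ASCII comes back exactly as it went in.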
RichCor Posted May 9, 2014
skraach wrote:
> I have a large (20,000 entries) database in .sql format that I want to edit. I imported it into Excel and the English content displays correctly, but the Thai looks like this: &#3648;&#3629;&#3585;&#3626;&#3634;&#3619;&#3609;&#3637;&#3657;&#3626;&#3636;&#3657;&#3609;&#3626;&#3640;&#3604; Can excel convert this to display the Thai characters? Should I have specified the encoding during the import process? What is the name of this encoding?
As already stated, text encoding is Unicode. Why not use an SQL viewer? Are you using Excel because you want to view/analyze a copy of the data, or are you hoping to edit/manipulate and export it back?
Editing an SQL Server table in Excel
AyG Posted May 9, 2014
RichCor wrote:
> As already stated, text encoding is Unicode. Why not use an SQL viewer? Are you using Excel because you want to view/analyze a copy of the data, or are you hoping to edit/manipulate and export it back???
I don't believe it was already stated that the text encoding is Unicode. In fact, I think it far more probable that the encoding is ASCII or a close relative, which would explain why the database data include HTML character entities. The questions, however, would if answered help suggest an appropriate solution.
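Whether the dump really is ASCII can be checked directly instead of guessed. The snippet below is a rough heuristic in Python, not a full encoding detector: the function name is invented for this example, it only looks for a byte-order mark and for non-ASCII bytes, and it needs Python 3.7 or later for `bytes.isascii()`.

```python
import codecs


def guess_encoding(raw: bytes) -> str:
    """Very rough guess at a file's encoding from its raw bytes."""
    # A UTF-16 byte-order mark is what Microsoft tools usually write for 'Unicode'.
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    # No byte above 0x7F: pure ASCII, so any Thai must be stored as entities.
    if raw.isascii():
        return "ascii"
    return "unknown"


if __name__ == "__main__":
    print(guess_encoding(b"INSERT INTO t VALUES ('&#3648;&#3629;');"))
```

In practice one would pass in the bytes read from the .sql file opened in binary mode; a result of "ascii" would support the guess that the Thai text exists only as character references.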
RichCor Posted May 9, 2014
AyG wrote:
> I don't believe it was already stated that the text encoding is Unicode. In fact, I think it far more probable that the encoding is ASCII or a close relative, which would explain why the database data include HTML character entities. The questions, however, would if answered help suggest an appropriate solution.
Yes, true. Both UnicodeLookup and {UTF-8} icons show they are HTML/XML special character references (decimal form).
And, while it's neat that many things can be imported into spreadsheet apps like Excel, and I might use it to quickly view data, I wouldn't rely on a straight import of an SQL database and definitely wouldn't export it back. There are far more reliable tools available.
Hopefully the OP will pop back in and answer the initial question.
skraach Posted May 22, 2014 (Author)
I didn't have any luck with trying to convert the file, so what I ended up doing was using Excel's Find & Replace:
Find: &#3585; Replace: ก
Find: &#3586; Replace: ข
Find: &#3587; Replace: ฃ
etc.
This took a little while, but did the trick.
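The same find-and-replace table can be generated rather than typed by hand. The sketch below, in Python, is only an illustration of that idea and not what was actually done here: the function names are hypothetical, and it covers just the Unicode Thai block (U+0E01 to U+0E5B), skipping code points that have no assigned character.

```python
import unicodedata


def thai_replacement_pairs() -> list[tuple[str, str]]:
    """Build (find, replace) pairs such as ('&#3585;', 'ก') for the Thai block."""
    pairs = []
    for code_point in range(0x0E01, 0x0E5C):
        char = chr(code_point)
        # Unassigned code points have no Unicode name; skip them.
        if unicodedata.name(char, ""):
            pairs.append((f"&#{code_point};", char))
    return pairs


def apply_pairs(text: str, pairs: list[tuple[str, str]]) -> str:
    """Apply each find/replace pair in turn, like repeated Find & Replace."""
    for find, replace in pairs:
        text = text.replace(find, replace)
    return text


if __name__ == "__main__":
    pairs = thai_replacement_pairs()
    for find, replace in pairs[:3]:
        print("Find:", find, "Replace:", replace)
    print(apply_pairs("&#3585;&#3586;&#3587;", pairs))
```

The printed pairs could be pasted into a spreadsheet or macro, or `apply_pairs` could be run over the exported text directly.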