
Imported an SQL database into Excel but Thai text not displaying properly


skraach

I have a large (20,000 entries) database in .sql format that I want to edit. I imported it into Excel and the English content displays correctly, but the Thai looks like this:

&#3648;&#3629;&#3585;&#3626;&#3634;&#3619;&#3609;&#3637;&#3657;&#3626;&#3636;&#3657;&#3609;&#3626;&#3640;&#3604;

Can Excel convert this to display the Thai characters?

Should I have specified the encoding during the import process? What is the name of this encoding?


What you have here is a load of numeric character entities (as in HTML) - thus &#3648; = เ = sara e.

If they are in the original file, and the original file is indeed a text file, my first approach without a specialist program would be to tack a '.htm' extension onto the file name, open the file in a browser, select the whole page, and save it in a text file. This approach will probably not work if the data is separated by tabs, because the browser reduces each sequence of white space to a single character. The text you have quoted translates as 'this document terminates', being letter by letter SARA E, O ANG, KO KAI, SO SUA, SARA AA, RO RUA, NO NU, SARA II, MAI THO, SO SUA, SARA I, MAI THO, NO NU, SO SUA, SARA U, DO DEK.
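The letter-by-letter reading can be checked mechanically. This is only a minimal Python sketch, assuming the imported cells held decimal character references for the quoted text (the `sample` string is that assumption, not data taken from the actual file). It uses the standard-library `html.unescape` and `unicodedata.name`.

```python
import html
import unicodedata

# Assumed sample: decimal character references for the quoted Thai text.
sample = (
    "&#3648;&#3629;&#3585;&#3626;&#3634;&#3619;&#3609;&#3637;"
    "&#3657;&#3626;&#3636;&#3657;&#3609;&#3626;&#3640;&#3604;"
)

# Turn each &#NNNN; reference into the character with that code point.
decoded = html.unescape(sample)
print(decoded)

# List every character with its code point and official Unicode name.
for ch in decoded:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")
```

The Unicode names carry a "THAI CHARACTER" prefix, so the first line of the listing reads THAI CHARACTER SARA E.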

What character encoding did Excel think this document was in? I would guess that it was in ASCII, but it might be UTF-16 (generally = 'Unicode' to Microsoft applications).
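If you would rather check the file than guess, a rough test is possible in a few lines of Python. This is a sketch only: `sniff_encoding` is a hypothetical helper name, and the check is deliberately crude (it looks for a byte order mark, then for pure 7-bit content), so treat the answer as a hint rather than proof.

```python
import codecs

def sniff_encoding(raw: bytes) -> str:
    """Return a rough guess: 'utf-16', 'utf-8-sig', 'ascii' or 'unknown'."""
    # A UTF-16 file normally starts with a byte order mark.
    # (A UTF-32 little-endian BOM begins with the same two bytes.)
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    # Only bytes 1..127 and no NULs: plain ASCII, entities included.
    if all(0 < b < 0x80 for b in raw):
        return "ascii"
    return "unknown"

print(sniff_encoding(b"INSERT INTO t VALUES ('&#3648;');"))
print(sniff_encoding("\u0e40".encode("utf-16")))
```

On a real dump you would pass in the first few kilobytes read in binary mode. A file that is pure ASCII but full of `&#...;` sequences fits the character-entity explanation.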

Another approach is to save the file in CSV format and write your own program to convert the numbers to characters - or try the above trick of using a browser on the saved CSV file.
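The "write your own program" route could look something like the Python sketch below. It assumes the data has been saved as CSV; `decode_entities_csv` is a made-up name, and the demo runs on an in-memory string rather than a real file. Note that `html.unescape` also decodes named entities such as `&amp;`, which may or may not be what you want.

```python
import csv
import html
import io

def decode_entities_csv(src, dst):
    """Copy CSV rows from src to dst, decoding character references in each cell."""
    writer = csv.writer(dst)
    for row in csv.reader(src):
        writer.writerow([html.unescape(cell) for cell in row])

# In-memory demo. With real files, open both with newline="" and write
# the output as UTF-8 (encoding="utf-8-sig" usually helps Excel detect it).
src = io.StringIO("1,Hello,&#3585;&#3586;\n")
dst = io.StringIO()
decode_entities_csv(src, dst)
print(dst.getvalue())
```

Going through the `csv` module rather than plain string replacement keeps quoted cells and embedded commas intact.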


Thinking a bit beyond the original problem, I'm guessing that the database is used to provide data for a web-based application. Can't think of any other reason why the data should be in its current format.

That the data are stored in the database as HTML character entities leads me to think that the specified character set of the web pages where the data are displayed is not UTF-8.

If you convert the character entities to Unicode and then import the edited data back into the database, the web application will not display the data correctly.

Probably the sensible thing to do would be to change everything to work with Unicode: data, database, web pages. That may not be a trivial task, depending upon the application. The alternative would be to convert the converted data back to character entities.
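For the second alternative, converting edited text back to character entities, Python's built-in `xmlcharrefreplace` error handler does the job. A minimal sketch, with `to_char_refs` being a hypothetical helper name:

```python
import html

def to_char_refs(text: str) -> str:
    """Replace every non-ASCII character with a decimal reference like &#3648;."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")

# Round trip: decode for editing, then re-encode before loading the data back.
edited = html.unescape("&#3648;&#3629;") + " ok"
print(to_char_refs(edited))  # &#3648;&#3629; ok
```

ASCII characters pass through untouched, so English content and SQL syntax are not affected.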


Posted Yesterday, 12:57

As already stated, text encoding is Unicode.

Why not use an SQL viewer?

Are you using Excel because you want to view/analyze a copy of the data, or are you hoping to edit/manipulate and export it back???

Editing an SQL Server table in Excel


I don't believe it was already stated that the text encoding is Unicode. In fact, I think it far more probable that the encoding is ASCII or a close relative, which would explain why the database data include HTML character entities.

Answers to those questions, however, would help suggest an appropriate solution.


Yes, true. Both UnicodeLookup and {UTF-8} icons show that they are HTML/XML special character references (decimal form).

And, while it's neat that many things can be imported into spreadsheet apps like Excel, and I might use it to quickly view data -- I wouldn't rely on a straight import of an SQL database and definitely wouldn't export it back. There are far more reliable tools available.

Hopefully the OP will pop back in and answer the initial question.


  • 2 weeks later...

I didn't have any luck with trying to convert the file, so what I ended up doing was using Excel's Find & Replace:

Find: &#3585; Replace: ก
Find: &#3586; Replace: ข
Find: &#3587; Replace: ฃ

etc

This took a little while, but did the trick.
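For anyone landing here later, the same per-character Find & Replace can be done in one pass with a regular expression. A Python sketch, not what was actually used above: `replace_thai_refs` is a hypothetical name, and the range 0x0E01 to 0x0E5B is my assumption for "Thai characters" (it spans the assigned part of the Unicode Thai block, including a small unassigned gap).

```python
import re

THAI_FIRST, THAI_LAST = 0x0E01, 0x0E5B  # KO KAI .. last assigned Thai character

def replace_thai_refs(text: str) -> str:
    """Replace decimal references in the Thai range; leave all others alone."""
    def sub(match):
        code = int(match.group(1))
        if THAI_FIRST <= code <= THAI_LAST:
            return chr(code)
        return match.group(0)  # not Thai: keep the reference as written
    return re.sub(r"&#(\d+);", sub, text)

print(replace_thai_refs("&#3585;&#3586;&#3587; &#38;"))
```

Restricting the range mirrors the manual approach: only the Thai references change, while something like `&#38;` stays as it is.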

