Skip to content

A GUI tool that allows you to validate the text encoding of one or more files. Modified from https://encodingchecker.codeplex.com/

License

Notifications You must be signed in to change notification settings

Albert-Coder/EncodingChecker

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

15 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Build status

EncodingChecker v2.0

File Encoding Checker is a GUI tool that allows you to validate the text encoding of one or more files. The tool can display the encoding for all selected files, or only the files that do not have the encodings you specify.

File Encoding Checker requires Microsoft .NET Framework 4 to run.

form image

Fixed issues

Sorting the results by clicking a column header is working now.

When viewing a directory, some files matching the file masks were not listed.

Improved performance of the list view control for faster processing of results.

Added feature to export selected results to a text file.

Switched to UtfUnknown library for better encoding detection (Multiple bugs from Ude fixed).

Validating the detected file encoding to avoid errors during conversion of files.

UTF-16 text files without byte-order-mark (BOM) can be detected by heuristics.

Credits

The original project EncodingChecker on CodePlex was written by Jeevan James.

For encoding detection, File Encoding Checker uses the UtfUnknown library, which is a C# port of uchardet library - A C++ port of the original Mozilla Universal Charset Detector.

Supported Charsets

File Encoding Checker currently supports over forty charsets.

  • ASCII
  • UTF-7 (with a BOM)
  • UTF-8 (with or without a BOM)
  • UTF-16 BE or LE (with or without a BOM)
  • UTF-32 BE or LE (with a BOM)
  • Arabic: iso-8859-6, windows-1256.
  • Baltic: iso-8859-4, windows-1257.
  • Central European: ibm852, iso-8859-2, windows-1250, x-mac-ce.
  • Chinese (Traditional and Simplified): big5, GB18030, hz-gb-2312, x-cp50227.
  • Cyrillic (primarily Russian): IBM855, cp866, iso-8859-5, koi8-r, windows-1251, x-mac-cyrillic.
  • Estonian: iso-8859-13.
  • Greek: iso-8859-7, windows-1253.
  • Hebrew: iso-8859-8, windows-1255.
  • Japanese: euc-jp, iso-2022-jp, shift_jis.
  • Korean: euc-kr, iso-2022-kr, ks_c_5601-1987 (cp949).
  • Thai: windows-874 (aliases TIS-620 and iso-8859-11 in .NET)
  • Turkish: iso-8859-3, iso-8859-9.
  • Western European: iso-8859-1, iso-8859-15, windows-1252.
  • Vietnamese: windows-1258.

About

A GUI tool that allows you to validate the text encoding of one or more files. Modified from https://encodingchecker.codeplex.com/

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • C# 100.0%