Character Encoding

Encode Special Characters in Markup

Have you ever wanted to encode a giant chunk of HTML markup (or other source code) into HTML entities or numeric character references (NCRs)?

For example, maybe you’ve found a chunk of text online, in one of your old blogs perhaps, and you want to copy and paste it into a new bit of HTML markup. You copy the text from the old web page into your new code, but after publishing you find that some HTML tags which the old page displayed using character entities (such as &lt; for the < symbol, enclosed in <pre> tags to simulate a code view) are now being interpreted as markup in the new page. This happens because copying from the rendered page picks up the literal < character, not the &lt; entity that sits in the page source. To your dismay, you realize that for those HTML <code> tags to be viewed as code again by your new readers, you have to go back through the markup with a fine-tooth comb, searching for every tag which was supposed to be presented using entities.
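To make the problem concrete, here is a minimal sketch in Python (my choice of language for illustration; nothing in the workflow described here requires it). It uses only the standard library's html module. The sample string is made up for the demonstration.

```python
import html

# What the old page's source contained: a tag written with entities
# so that it displays as code instead of being treated as markup.
old_source = "&lt;code&gt;x &amp; y&lt;/code&gt;"

# What the browser displays, and therefore what copy / paste picks up.
copied_text = html.unescape(old_source)

# Re-encode the copied text so that it shows as code again in the new page.
# quote=False leaves quotation marks alone and escapes only &, < and >.
re_encoded = html.escape(copied_text, quote=False)

print(copied_text)
print(re_encoded)
```

The first line printed is the literal markup that a browser would interpret; the second is the entity-encoded form that is safe to paste into the new page.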

Encoding HTML Tags into Entities: Quick and Painless!

If the above scenario is familiar to you, whether as a workplace duty bestowed upon you or in your own personal projects, you are in luck: software applications exist which will do the encoding for you.

You may already have a text editor which will perform this task for you. Look in its menus for commands such as “Strip HTML Tags” or “Convert HTML to Entities”, or anything with the words Unicode, Character Encoding, or Entities. Give it a try: the worst that can happen is that you have to hit the undo button (but be sure to save a backup file first, just in case).

If you don’t already own a software app which will do the character encoding wizardry for you, then I wholeheartedly recommend a very nice little free application named BabelPad, by BabelStone.

If you decide to go this route, then once you’ve installed BabelPad on your system (Linux users can run it under Wine; I’ve tested and used it successfully on Fedora Core 5 myself), simply paste the chunk of text containing the “unwanted” HTML markup (the text you copied from its original location) into the BabelPad main editor window, highlight the section of text which you wish to modify, then right-click on the selection to pull up the BabelPad context menu. From the context menu, select Convert Unicode to HTML Entities and voilà! You’ve got a chunk of code which you can publish in your new page without the worry that the old HTML tags presented in the original page view will affect your own new markup.
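For readers who would rather script the conversion, the following is a rough Python sketch of the same idea. It is not BabelPad's implementation, and its output may differ from what BabelPad produces; the function name to_entities and its named parameter are hypothetical, invented for this example. It relies on the standard library's html.entities.codepoint2name table, which covers the HTML 4 named entities.

```python
from html.entities import codepoint2name


def to_entities(text: str, named: bool = True) -> str:
    """Replace markup characters and non-ASCII characters with references.

    Hypothetical helper for illustration only. With named=True, characters
    that have an HTML 4 entity name use it (e.g. &lt;); everything else that
    needs encoding becomes a decimal numeric character reference (e.g. &#233;).
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if named and cp in codepoint2name:
            # A named entity exists for this code point.
            out.append(f"&{codepoint2name[cp]};")
        elif ch in '<>&"' or cp > 127:
            # Fall back to a decimal numeric character reference.
            out.append(f"&#{cp};")
        else:
            # Plain ASCII that needs no encoding.
            out.append(ch)
    return "".join(out)


sample = "<p>caf\u00e9</p>"
print(to_entities(sample))
print(to_entities(sample, named=False))
```

Named entities are easier to read in the source, while numeric references work for any character, including those with no entity name, which is why the sketch falls back to them.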

Good luck! (Note: BabelPad is not the only software which performs this conversion, but it is one of the more full-featured character encoding applications I’ve ever seen. I highly recommend it as part of your software arsenal.)

see also Character Encodings and HTML Entities

All Content © 2006 - 2009, NoviceNotes™ | © 2009 NoviceNotes.Net