On Jul 2, 11:31 am, Mike Samuel ‹mikesam...@gmail.com› wrote:
› 2009/7/1 Xah Lee ‹xah...@gmail.com›:

› › Hi Mike,

› › Thanks a lot for the answer.

› › is it possible for prettify automatically replace the ‹› chars to
› › entities?

› No. Since prettify reads the HTML, the input has to be properly
› formed HTML, not plain text.
› There are two exceptions:
› (1) You can try using XHTML in which case you could put code in
› ‹![CDATA[ ]]› sections. This will work unless your code contains the
› string "]]›"
› (2) In regular HTML, some tags, like ‹xmp› have content that is not
› escaped. You can put your code in an ‹xmp› block instead of a ‹pre›
› block, but in that case your code can't contain the string "‹/xmp"
› case insensitively.
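
As a rough illustration of the two exceptions above, here is a minimal JavaScript sketch that checks whether a piece of code could go unescaped into a CDATA section or an xmp block. The function names are made up for this example and are not part of prettify.

```javascript
// Hypothetical helpers, not part of prettify.

// A CDATA section ends at the first "]]>", so the code must not contain it.
function fitsInCdata(code) {
  return !code.includes("]]>");
}

// An xmp block ends at "</xmp" in any letter case, so the code must not contain it.
function fitsInXmp(code) {
  return !/<\/xmp/i.test(code);
}
```

Code such as `a[b[0]]>1` fails the first check, which shows the CDATA route is not fully safe for arbitrary source code.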

Thanks. Very informative.

This is informative to me because ‹![CDATA[ ]]› provides a workaround for the need to pre-process the text in XML/XHTML.

› › cause i was thinking, if people have to pre-process their source code,
› › even as trivial as replacing ‹ to &lt;, prettify lost one of its major
› › attraction.

› It is almost that trivial. You have to replace & with &amp; and then
› ‹ with &lt;.
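
For concreteness, a minimal sketch of that two-step replacement in JavaScript. The function name is hypothetical. The order matters: the ampersand has to be handled first, otherwise the ampersand introduced by the second step would itself get escaped.

```javascript
// Hypothetical helper: escape source code for placement inside a pre block.
function escapeForPre(code) {
  return code
    .replace(/&/g, "&amp;") // step 1: ampersands first
    .replace(/</g, "&lt;"); // step 2: then less-than signs
}
```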

I understand. But to quibble a bit, the beauty of Google Code Prettify (GCP) for me was that the user doesn't need to pre-process the text. If he has to encode even just a few chars such as “‹ › &”, i feel this thwarts one main advantage of GCP because:

(1) not all programmers are intimately familiar with HTML/XML's encoding/entities spec. Technically, which chars need to be encoded and in what situation is quite non-trivial. (e.g. “‹” doesn't need to be encoded if it is surrounded by spaces, as in “x ‹ y”, but does in “x‹y”.)

(2) in practice, almost all code will contain one or more of “‹ › &”. So this means pre-processing must be done.

(3) For a seasoned programmer, if he needs to pre-process the code, he might as well call a span-based htmlizer script to automate the process.

(4) if the code needs to be pre-processed, one GCP advantage is lost because now the code can't be readily edited. One needs to undo the pre-processing before editing the code, then pre-process it again to display on the web. (e.g. change “&lt;” back to “‹”, edit, then change “‹” back to “&lt;”.)

(5) complications follow from the above. If your code is php, python, perl, html, or css (these are among the top 10 most used langs), it most often contains code to parse html/url or contains raw html/url itself, so the encoding gets quite complex. (e.g. the ampersand in a url sometimes must be percent-encoded to %26. And if the code processes html, it is likely to have a regex that tries to parse the char sequence “&lt;”, so now you end up with ugliness such as “&amp;lt;” to feed to GCP.) If the code contains a slightly complex regex, manual encoding/decoding is error prone.
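
To make points (4) and (5) concrete, here is a small JavaScript sketch of the edit round trip, under the assumption that only & and the less-than sign are escaped. Both function names and the sample line are invented for illustration.

```javascript
// Hypothetical helpers for the escape / edit / re-escape cycle.
function escapeForPre(code) {
  return code.replace(/&/g, "&amp;").replace(/</g, "&lt;");
}

// Undo in the reverse order: less-than first, ampersand last.
function unescapeFromPre(html) {
  return html.replace(/&lt;/g, "<").replace(/&amp;/g, "&");
}

// A line of code that itself deals with the entity for less-than.
const source = 'text.replace(/&lt;/g, "<")';
const escaped = escapeForPre(source);
const restored = unescapeFromPre(escaped);
```

Here `escaped` holds `text.replace(/&amp;lt;/g, "&lt;")`, which is the kind of ugliness one has to paste into the page, and `restored` is equal to `source` again.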

› I agree that it's a burden, but it's an architectural limitation to
› doing it client side, unless you were to use an iframe to embed a
› plain text file. And in that case you'd be at the mercy of browser
› content-type sniffing.

› › PS do you have a write up somewhere about how the js coloring works?
› › it seems pretty magical. i.e. i don't think it actually replace the
› › lang code into span wrapped html.

› It should be better documented. I'm not quite sure I understand your
› question. Can you elaborate?

A silly off-topic question from me: i was wondering about the js technology used to implement Google Code Prettify.

e.g. suppose i have

a b c

and i want to use js to, say, make b red. How to do this in js?

The way i know is to have js replace the “a b c” with “a ‹span class="redclass"›b‹/span› c”. But that's apparently not how Google Code Prettify works, or am i mistaken?
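
A minimal sketch of the replace-with-span approach i mean, written as a plain string function so it runs anywhere. The function name is made up. In a browser one would presumably assign the result to the element's innerHTML. This only illustrates the approach i know, not necessarily what prettify does.

```javascript
// Hypothetical helper: wrap every space-separated token equal to `word`
// in a span carrying the given class name.
function wrapWord(text, word, className) {
  return text
    .split(" ")
    .map((token) =>
      token === word ? `<span class="${className}">${token}</span>` : token
    )
    .join(" ");
}

const colored = wrapWord("a b c", "b", "redclass");
```

With a css rule such as `.redclass { color: red; }`, the b would then show up red.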

∑ http://xahlee.org/
