
COMPLETED (33208) - [BUG] HTML validator broken/Unicode


dfx


As it was thoroughly ignored over there, here's a dedicated thread, so it can be ignored even better:

 

Entering the following into a cache listing:

 

<foo>test</foo>
<foo>test</foo>
<foo>test</foo> [line butchered by the forum, see here for the raw text: http://pastebin.com/2TmR5UVe ]
&amp;lt;foo&amp;gt;test&amp;lt;/foo&amp;gt;

 

instantly gets translated into this upon submission:

 

test test
<foo>test</foo>
<foo>test</foo>

 

So, you have to triple escape the HTML entities to make them show up. Additionally, double escaping allows you to inject disallowed HTML code (such as <foo>). Submitting the form a second time would put the code through a second round of butchering and mess it up even more.
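A minimal sketch of the behaviour described above, assuming (this is a guess, not something Groundspeak has confirmed) that the submission path decodes HTML entities twice before the text is shown. The helper names `escape_layers` and `decode_passes` are made up for illustration. Under that assumption, double-escaped markup comes out as a live tag, and only triple-escaped markup survives as visible text.

```python
import html


def escape_layers(text: str, layers: int) -> str:
    """Escape &, < and > the given number of times (hypothetical helper)."""
    for _ in range(layers):
        text = html.escape(text, quote=False)
    return text


def decode_passes(text: str, passes: int) -> str:
    """Decode entities repeatedly; each pass peels off one layer of escaping."""
    for _ in range(passes):
        text = html.unescape(text)
    return text


raw = "<foo>test</foo>"
double = escape_layers(raw, 2)   # &amp;lt;foo&amp;gt;test&amp;lt;/foo&amp;gt;
triple = escape_layers(raw, 3)

# ASSUMPTION: two decoding passes on the server side.
print(decode_passes(double, 2))  # a real <foo> element reaches the page
print(decode_passes(triple, 2))  # &lt;foo&gt;... which a browser renders as visible text
```

If the form is saved again, the already decoded text goes through the same passes once more, which would fit the observation that a second submission mangles it further.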

Link to comment

Hi, there have been many threads about this, and promises to fix it. I'm talking about some characters that are not available in cache descriptions, like "ąęśćńźżł" and probably many more, which are used in some European alphabets. This problem was fixed in TB descriptions, but not in cache descriptions. Is there any chance of fixing that? What is the technical problem with it?

I'll be grateful for any answer from Groundspeak Lackeys.

This is one of the reasons why people in some countries prefer local caching sites to the geocaching.com website...

Link to comment

The current problem is already explained elsewhere. You have to use unicode HTML entities, such as & #261; (without the space) for the letter ą. However, the HTML validator is currently screwed up, and you have to enter the entity as &&#261; instead, and make sure you save/submit it only once (it will get messed up again if you save more than once).
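For anyone who needs to generate those numeric entities, here is a small sketch using only the Python standard library. The function name `to_numeric_refs` is made up for illustration. It only produces the plain single-level form (such as the entity for ą mentioned above); any extra escaping the broken validator currently needs is not handled here.

```python
import html


def to_numeric_refs(text: str) -> str:
    """Replace every non-ASCII character with a decimal character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


encoded = to_numeric_refs("ąęśćńźżł")
print(encoded)

# Sanity check: a standards-compliant decoder gets the original text back.
assert html.unescape(encoded) == "ąęśćńźżł"
```

ASCII text passes through unchanged, so the whole description can be run through the function in one go.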

 

So far, the problem has barely been acknowledged, let alone fixed, despite the fact that it allows you to inject disallowed HTML code into your listings...

Edited by dfx
Link to comment

hello Groundspeak!

 

I'd like to renew the really old topic on stripping the national characters out of the cache descriptions when creating or editing cache listings.

 

by saying "really old" I mean the issue was first reported 6 YEARS ago here and then 2 YEARS later here.

 

both topics are now closed and the last Groundspeak news was you were aware of the issue, were sorry for that and were planning to fix this.

that was in August 2009...

 

today we are another 2.5 YEARS LATER but the bug seems to be still there.

 

could you please clarify what the current status on this is? has anything been done? is it still being planned? or maybe declined?

 

I'm from Poland and when posting any new cache listing I try to use the 9 Polish diacritic characters (18 counting both cases), unfortunately with poor results. The only Polish character that gets accepted is "ó" - that's sad..
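One possible explanation for why only "ó" survives, offered purely as a guess: if the edit form or its storage used a Western European single-byte encoding such as Latin-1, "ó" would be the only Polish diacritic it can represent. The sketch below checks that; the choice of encoding and the names `POLISH_LOWER` and `representable` are assumptions made for illustration.

```python
POLISH_LOWER = "ąćęłńóśźż"  # the 9 lowercase Polish diacritic letters


def representable(ch: str, encoding: str) -> bool:
    """Return True if the character can be encoded in the given encoding."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# ASSUMPTION: the site squeezes text through Latin-1 somewhere.
survivors = [ch for ch in POLISH_LOWER if representable(ch, "latin-1")]
print(survivors)  # → ['ó']
```

The same check with "cp1252" gives the same single survivor, so the observation fits either encoding but does not prove which one, if any, is involved.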

 

I guess the problem applies not only to Polish but to speakers of any other non-English language around the world who would like to use their native language and characters in cache listings. and since the geocaching.com site is totally global now I think we deserve more attention and more effort in solving this global issue too.

 

really looking forward to hearing some good news from you on this soon!

 

kind regards!

Link to comment

[...]we will be doing a hot-fix this afternoon.

-Raine

Apparently this broke my cache listing, which included HTML-entity encoded Japanese characters. When saving the page, the Japanese characters get decoded and are shown in the edit box. In the cache listing, there are only "?". Now, since this is a mystery and part of the hints are in the Japanese text, this is really bad. Please fix immediately.
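The symptom described above (entities decoded on save, then "?" in the listing) is what you would see if decoded text were pushed through an encoding that cannot hold it. This is a sketch of that failure mode only, not a description of Groundspeak's actual code; the Latin-1 storage encoding and the name `lossy_roundtrip` are assumptions.

```python
import html


def lossy_roundtrip(entity_text: str, storage_encoding: str = "latin-1") -> str:
    """Decode entities, then store the result in a narrow encoding (hypothetical)."""
    decoded = html.unescape(entity_text)  # entities become real characters
    # Characters the encoding cannot represent are replaced with "?".
    stored = decoded.encode(storage_encoding, errors="replace")
    return stored.decode(storage_encoding)


print(lossy_roundtrip("&#26085;&#26412;&#35486;"))  # → ???
print(lossy_roundtrip("&#243;"))                    # → ó
```

Once the "?" replacement has happened the original characters are gone, which is why re-saving the page cannot bring them back.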

Link to comment

The current problem is described here: http://forums.Groundspeak.com/GC/index.php?showtopic=288981

well, I wouldn't say the HTML validator is the root cause of the problem.

 

In short, cache pages are in unicode, but the edit page entry box doesn't accept UTF-8.

so this is the real problem here. but as OpinioNate said here, Groundspeak was planning to replace this non-UTF text box back in 2009..

 

so is it really still the case??

 

can we hear some official news from Groundspeak team please?

 

The workaround is to encode the unicode characters as HTML entities, but with the broken HTML validator you have to encode them thrice.

well, I can't imagine how you would explain to some non-technical person (as probably 99% of geocachers are) that instead of simply writing their national characters they need to generate some black magic HTML entities stuff and find/replace it in their cache descriptions before posting..

 

IMHO this is not an option at all.

Link to comment

I have made a cache listing, and wanted to include Korean.

Usually we convert the Korean into Unicode, and then include it in the listing.

The result should be Korean characters in cache listings.

However, the Unicode is now broken and has turned into just "????".

Maybe the recent upgrades of the site caused it.

Please check it for me.

Link to comment

Please address this bug as soon as possible.

 

4 of the 13 puzzle caches in my published Lost Cities puzzle series rely on Unicode to support the foreign language text of their puzzles. When I revised the web pages today to remove a temporary advisory, the Unicode was stripped, and the foreign language text of the puzzles now displays as all "???".

 

"Lost Cities" is a very popular series and it is a shame that cachers won't be able to pursue it due to these technical problems.

 

This is the second bug from recent upgrades that has corrupted my cache pages. From now on, every time that there is a Geocaching.com site update, I'll cringe at the thought of what might happen to my cache pages.

Edited by Lati.dude
Link to comment

Until Groundspeak fixes this you can display the foreign characters as an image. It might not be readable on some paperless GPS devices, but then again they probably couldn't display Unicode characters in the first place.

Thanks for the suggestion Ambient Skater.

 

Unfortunately, the key to solving my puzzles that rely on foreign language Unicode is to highlight and copy the foreign characters on a PC or Android, then paste them into Google Translate or run a direct Google search. That is why I used Unicode in the first place. Pretty hard to transcribe them from an image into the appropriate foreign language unless you have the right keyboard. :wacko:

 

I tried re-coding the HTML for my Unicode characters per the suggestion from dfx above (e.g. "&&#261;"), but without success. Does anyone have a workaround until Groundspeak fixes this bug?

 

Luckily, I saved copies of the original HTML for all my pages prior to the 1/17/2012 update. The new Groundspeak HTML warning should read: To prevent lost data, we strongly recommend you save an offline copy of your cache description before submitting to the website or making any revisions thereto.

Link to comment
