Tag clouds and alphabetical ordering in non-alphabetical languages such as Chinese and Japanese

Via Silver Oliver, who works at the BBC: Do tag clouds work in Chinese? (Chinese translation of this article). Rex Wong talks about the alphabetical ordering problems of non-alphabetical languages, the problems of scaling Chinese fonts up and down, and the mixing of Chinese characters and English words.

At the end of the day, though, tag clouds do seem to work fine in Chinese, as they do in Japanese. The alphabetical ordering algorithm is largely irrelevant because tag clouds guide the eye mostly by size (a tag cloud is not used for known-item searching, where alphabetical ordering matters, but for serendipitous discovery).
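
To make the "size, not order" point concrete, here is a minimal sketch of how a tag cloud might turn tag counts into font sizes. The function name, the pixel range, and the choice of logarithmic scaling are my own assumptions for illustration, not how any particular site does it. Note that nothing in it cares which language or script the tags are written in.

```python
import math

def tag_font_sizes(counts, min_px=12, max_px=36):
    """Map positive tag counts to font sizes in pixels (hypothetical helper).

    Log scaling is an assumption: it keeps one very popular tag
    from shrinking every other tag down to the minimum size.
    """
    if not counts:
        return {}
    logs = {tag: math.log(count) for tag, count in counts.items()}
    lo, hi = min(logs.values()), max(logs.values())
    span = hi - lo
    sizes = {}
    for tag, value in logs.items():
        # If every tag has the same count, fall back to the midpoint size.
        weight = (value - lo) / span if span else 0.5
        sizes[tag] = round(min_px + weight * (max_px - min_px))
    return sizes

# Made-up counts, mixing scripts on purpose.
print(tag_font_sizes({"北京": 1, "music": 10, "東京": 100}))
```

The order in which you then lay the tags out is a separate decision, and by this argument a much less important one.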

Also, collation algorithms that order these languages in a “good enough” manner already exist and are built into the infrastructure we use (programming languages and databases), so you can probably safely forget about this.

I’ve been told that in Japan, tag clouds are pretty popular and work well.

Chinese tag cloud:

image

Japanese tag cloud:

image

How about tag clouds in other languages? Please point me to examples of tag clouds in as many languages as you can find; I want to put together a collection.

Another issue, of course: if you aggregate lots of content and tags, how do you deal with mixed-language tag clouds? Here’s an (older) example of mixed languages in a tag cloud:

image

The best approach seems to be to separate tag clouds by language, and then present users with a tag cloud in their own language. There are automatic ways to do the separation (I still have to research which ones work best). (Some more thoughts on this.)

A note about alphabetical ordering of non-alphabetical languages (like Chinese or Japanese): it is often possible to order these languages using an alternative collation system called radical-and-stroke ordering. Below is a screenshot of an “alphabetical” index in Japanese.

alfaorder-japanese
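
As a small illustration of getting a "good enough" order for free: Python's built-in sort compares strings by Unicode code point, and the characters in the main CJK Unified Ideographs block were assigned their code points in radical-and-stroke order. So a plain sort of single-character tags from that block comes out in roughly dictionary order. This is a sketch under that assumption only; it does not hold across the later CJK extension blocks, it ignores pronunciation-based orders, and a proper locale-aware collation library would be the more careful choice.

```python
# Five common characters, deliberately shuffled.
tags = ["火", "山", "水", "一", "人"]

# sorted() compares by code point; within the main CJK block that
# follows radical order (一 is radical 1, 人 is 9, 山 is 46, 水 is 85, 火 is 86).
ordered = sorted(tags)

for ch in ordered:
    print(ch, f"U+{ord(ch):04X}")
```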

And a question for our Japanese speakers to finish this off: what is Taggy?

8 Responses to “Tag clouds and alphabetical ordering in non-alphabetical languages such as Chinese and Japanese”

  1. Peter Van Dijck’s Guide to Ease » Blog Archive Says:

    […] Related: I wrote a post on 290s.com about tagclouds in Chinese and Japanese. […]

  2. admin Says:

    Also, I’d love to hear about more examples of using tags in languages different than English, especially non-alphabetical and non-Latin alphabet languages. Leave them in the comments! :)

  3. admin Says:

    Here’s a Portuguese tagcloud:
    http://www.overmundo.com.br/home/nuvem_tags.php (via Gustavo from http://gawry.com)

  4. Wei Ding Says:

    Hi, Peter
    Just wanted to mention that the Chinese tag cloud above does not seem to be in any predictable order. There are ways to organize words/phrases in Chinese based on pronunciation or number of strokes. It would be more useful if they were in a particular order.

  5. Jonathan Says:

    “It would be more useful if they were in a particular order.” Why? The utility of a tag cloud is not to show order but to show frequency, is it not?

  6. Jonathan Says:

    My written Japanese is frankly awful, but Taggy.jp appears to be a “search engine that you don’t have to search” http://taggy.jp/help/whatis.html - primarily for multi-media, it seems. It lets you set up search profiles that act as “agents” to pull in content against tags you (and others) set: wanna see lots of funny videos? Just set the “funny” tag up and let the videos come to you!

  7. admin Says:

    Here’s a tagcloud in lolcatz: http://icanhascheezburger.com/all-tags/

  8. Matthijs Says:

    Jonathan, The utility of a tag cloud is to show frequency, displayed in alphabetical order ..


