Converting WordPress from Gengo to WPML – part 2

In this post I am going to investigate the structure of categories and tags in WordPress.

Categories and tags

In WordPress every post has one or more categories and zero or more tags (more formally called ‘post-tags’). Categories are meant for a broad classification of your posts and are often listed in a sidebar; usually a post is in a single category. Tags are more lightweight characterizations, and a post often has a whole bunch of them. These days many blogs have a tag cloud in the sidebar, which is a less structured collection of the tags in use, often with the font size indicating the popularity of each tag. Internally, however, WordPress handles categories and tags in almost the same way.

Before WordPress 2.3 there were only categories, no tags. Categories could be attached to posts and links, so the database had two tables, post2cat and link2cat, with the obvious meaning. Categories in these tables were identified by numbers, and a separate table connected the numbers with the names and other properties of the categories.

When tags were introduced in WordPress 2.3, the developers chose not to just add a new database table post2tag but to make the structure more general, so that similar concepts could be added later without introducing new database tables. Post categories, link categories, tags and future similar things are now collectively called terms, and terms are classified in taxonomies. For our purpose the two taxonomies category and post_tag are relevant. If a certain word is used both as a category and as a tag, there is only a single term, but it occurs in two taxonomies. All these things are again represented as numbers in the database.

As an example, I have used the word ‘Wordpress’ both as a category and as a tag. Let’s look at how these are represented in the database. All this information can be found on the WordPress site; I will give only a simplified version here.
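
Before going through the tables, the idea of a single term occurring in two taxonomies can be sketched in a few lines of Python. This is only an illustration of the concept, not WordPress code: the names terms, term_taxonomy and taxonomies_of are made up for this example.

```python
from collections import defaultdict

# Hypothetical in-memory model (not WordPress code): each term name is stored
# once, and every (term, taxonomy) pairing is a separate entry.
terms = {"Wordpress"}
term_taxonomy = [
    ("Wordpress", "category"),
    ("Wordpress", "post_tag"),
]

# Collect, per term, the taxonomies it occurs in.
taxonomies_of = defaultdict(set)
for term, taxonomy in term_taxonomy:
    taxonomies_of[term].add(taxonomy)

print(len(terms))                          # 1
print(sorted(taxonomies_of["Wordpress"]))  # ['category', 'post_tag']
```

The word is stored once, while its use as a category and its use as a tag are two separate entries that point at it.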

First there is a table wp_terms that contains information about the term, mapping the term number to the name and a slug, which is a clean form of the name to be used in URLs. In my database ‘Wordpress’ is term number 19; in your database it will probably have a different number, if it is present at all. A second table, wp_term_taxonomy, maps the terms to the taxonomies. In my database ‘Wordpress’ as a category is number 23 and as a tag it is number 24. Then there is a table wp_term_relationships that couples the posts, pages, etc. to these ‘taxonomied terms’. So my previous post, ‘Converting WordPress from Gengo to WPML’, which happens to have post number 275, has two entries in this table: 275->23 for the category and 275->24 for the tag.

From a multilingual point of view this example isn’t very exciting, as ‘Wordpress’ wouldn’t have a separate translation in Dutch. So here is another example, which will let us see later on how translations are handled. I have a post about ‘Cary’s apple muffins’, which has post number 123, and a Dutch translation, ‘Cary’s appelcakejes’, with post number 86. Both are in the category ‘Cooking’ (taxonomy number 31, term number 26). The English version has three tags: ‘apple’, ‘muffins’ and ‘cakes’; the Dutch version has only a single tag, ‘cake’. The taxonomy numbers of these are 34, 35, 36 and 32, and the term numbers are 29, 30, 31 and 27, respectively.

This gives the following information in the database (columns not relevant for this discussion are omitted). The description column is not part of the database; it is my description of the row.

Table wp_term_relationships
object_id  term_taxonomy_id  description
86         31                ‘Cary’s appelcakejes’: category ‘Cooking’
86         32                ‘Cary’s appelcakejes’: tag ‘cake’
123        31                ‘Cary’s apple muffins’: category ‘Cooking’
123        34                ‘Cary’s apple muffins’: tag ‘apple’
123        35                ‘Cary’s apple muffins’: tag ‘muffins’
123        36                ‘Cary’s apple muffins’: tag ‘cakes’
275        23                ‘Converting …’: category ‘Wordpress’
275        24                ‘Converting …’: tag ‘Wordpress’

Table wp_term_taxonomy
term_taxonomy_id  term_id  taxonomy  description
31                26       category  category ‘Cooking’
32                27       post_tag  tag ‘cake’
34                29       post_tag  tag ‘apple’
35                30       post_tag  tag ‘muffins’
36                31       post_tag  tag ‘cakes’

Table wp_terms
term_id  name     slug     description
26       Cooking  cooking  category ‘Cooking’
27       cake     cake     tag ‘cake’
29       apple    apple    tag ‘apple’
30       muffins  muffins  tag ‘muffins’
31       cakes    cakes    tag ‘cakes’
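
One way to check the tables is to load them into a throwaway database and join them. The sketch below rests on a few assumptions: it uses Python’s built-in sqlite3 module as a stand-in for the MySQL database that WordPress runs on, it keeps only the simplified columns shown above, and the helper terms_of_post is a name invented for this example. Post 275 is left out because the rows for ‘Wordpress’ are not listed in the last two tables.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real WordPress database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wp_terms (term_id INTEGER PRIMARY KEY, name TEXT, slug TEXT);
CREATE TABLE wp_term_taxonomy (
    term_taxonomy_id INTEGER PRIMARY KEY, term_id INTEGER, taxonomy TEXT);
CREATE TABLE wp_term_relationships (object_id INTEGER, term_taxonomy_id INTEGER);
""")

# Rows copied from the three tables in the text (description column omitted).
conn.executemany("INSERT INTO wp_terms VALUES (?, ?, ?)", [
    (26, "Cooking", "cooking"), (27, "cake", "cake"), (29, "apple", "apple"),
    (30, "muffins", "muffins"), (31, "cakes", "cakes"),
])
conn.executemany("INSERT INTO wp_term_taxonomy VALUES (?, ?, ?)", [
    (31, 26, "category"), (32, 27, "post_tag"), (34, 29, "post_tag"),
    (35, 30, "post_tag"), (36, 31, "post_tag"),
])
conn.executemany("INSERT INTO wp_term_relationships VALUES (?, ?)", [
    (86, 31), (86, 32), (123, 31), (123, 34), (123, 35), (123, 36),
])


def terms_of_post(post_id):
    """Return (taxonomy, name) pairs for one post, sorted for stable output."""
    rows = conn.execute("""
        SELECT tt.taxonomy, t.name
        FROM wp_term_relationships AS tr
        JOIN wp_term_taxonomy AS tt ON tt.term_taxonomy_id = tr.term_taxonomy_id
        JOIN wp_terms AS t ON t.term_id = tt.term_id
        WHERE tr.object_id = ?
        ORDER BY tt.taxonomy, t.name
    """, (post_id,))
    return rows.fetchall()


print(terms_of_post(123))
# [('category', 'Cooking'), ('post_tag', 'apple'), ('post_tag', 'cakes'), ('post_tag', 'muffins')]
print(terms_of_post(86))
# [('category', 'Cooking'), ('post_tag', 'cake')]
```

The query follows the same path as the explanation: from the post number through wp_term_relationships to wp_term_taxonomy, and from there to the name in wp_terms.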

You can convince yourself that these tables adequately describe the categories and tags of the posts mentioned in the text. Two remarks: (1) The prefix wp_ is the default, but it can be changed in the blog’s configuration file. (2) There is also a table wp_posts that contains the contents of all pages, posts, images, etc. It is indexed by the ‘post number’.

In the next part we are going to look at the language information Gengo stores for categories and tags.