Web and Documentation Publishing Guide

Sysadmin France

This document is for webmasters and users of FSF France machines. It may also contain useful information outside this context.

1 Introduction

In many respects, the GNU way of doing things needs to be improved. When this is the case, the idea is to coordinate with the GNU volunteers in charge of a specific aspect in order to improve the system. Doing so is not as hard as contributing to a piece of software. It may sometimes be frustrating to stick to the GNU policy because it is quicker to get a nice result by inventing something completely new. The Web is probably the most striking example. However, since we claim to be part of the GNU project, we don't really have the option to fork something of our own and leave the GNU project behind. Even if it means that we will go slower, coordination with and improvement of the GNU project as a whole should be our goal.

Close cooperation with the GNU project also makes some things a lot easier. The account creation process, the management of the HTML tree from CVS, the DNS hosting, mailing lists etc. are facilities we do not need to re-implement. When installing something entirely new on the GNU machines located in France, it is important to send a note on the relevant mailing lists in advance. This prevents possible duplication of the effort and greatly increases the chances that the GNU project will be able to re-use the new facility.

The reference documents describing the GNU environment are standards, for webmaster information, and fencepost.gnu.org:/gd/gnuorg/sysadmin/sysadmin.texi, for system administration information. For usage of the project hosting facility, see the Gna! documentation.

2 XML and interoperability

When publishing content or building a database for specific purposes (contact database, task lists, account information, permissions, etc.), it is of the utmost importance to choose a data format that allows other programs to re-use the database and your own program to import databases from other programs.

Using XML for that purpose is only half of the answer. XML merely spares you from writing your own parser and saves you from the pitfall of inventing yet another data format. It says nothing about the semantics of the tags you decide to use. The harder part is to define the semantics associated with an XML file. For instance, DocBook uses XML, but the core of the work was to define what each tag means and to make sure it covers all possible needs. It took years and is not finished yet. Inventing new semantics for a given purpose is not something that should be undertaken lightly.
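
As a minimal sketch of this point, the Python fragment below reads a small RSS-like channel with the standard library parser and translates it into tab-separated lines for some other application. The channel content, the function names and the tab-separated target format are assumptions made for illustration only. The parser comes for free; knowing that title and link are the elements worth extracting is the semantics, and that knowledge comes from the RSS conventions, not from XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical channel, invented for this example; element names follow
# the usual RSS conventions (channel, item, title, link).
CHANNEL = """<?xml version="1.0"?>
<rss version="0.91">
  <channel>
    <title>Example news</title>
    <item><title>First item</title><link>http://example.org/1</link></item>
    <item><title>Second item</title><link>http://example.org/2</link></item>
  </channel>
</rss>
"""


def read_items(xml_text):
    """Return a list of (title, link) pairs, one per <item> element."""
    root = ET.fromstring(xml_text)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]


def to_tab_separated(items):
    """Translate the pairs into a hypothetical application-specific format."""
    return "\n".join(f"{title}\t{link}" for title, link in items)


if __name__ == "__main__":
    print(to_tab_separated(read_items(CHANNEL)))
```

The translation step is deliberately tiny: once the data lives in a shared format, each application only needs a small adapter of this kind.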

In some cases, semantics that fit your needs already exist. Even if they are not an established standard (RSS is a widespread de facto standard), using them will save you the effort of inventing something completely new. You may even be able to join the group that works on them and enhance them if need be. There are some DTD repositories that can help you find out whether something exists:

Once the data format is chosen, it drives the choice or the writing of the programs that provide a given service. If an application exists that does not support the chosen format, it is fairly reasonable to write a small program that translates the data format into the application-specific data and vice versa. The advantages of doing this are:

3 DocBook Quick Start

When using a Debian unstable distribution, it is fairly easy to write and format DocBook documentation with few manual operations. However, one has to know the shortest path; otherwise it quickly turns into a nightmare. An XSLT-based (as opposed to DSSSL) set of packages and commands is proposed here, assuming you are running unstable. At the time of this writing (October 2001) it also works if you are running testing, but you will need to install the packages recommended here from unstable.

3.1 DocBook packages

On Debian GNU/Linux unstable, install the following packages:

     apt-get install xsltproc
     apt-get install docbook-xsl
     apt-get install xmltex

Install passivetex as found at http://users.ox.ac.uk/~rahtz/passivetex/.

     cd /usr/share/texmf/tex/latex/
     mkdir passivetex
     cd passivetex
     unzip /tmp/passivetex.zip
     cd /usr/share/texmf/tex/xmltex/base
     pdftex -ini "&pdflatex" pdfxmltex.ini
     fmtutil --missing

Finally, the global memory size parameters of TeX must be increased. If formatting fails, check the log for "capacity exceeded" errors and increase the corresponding parameter. Some sample values are available at http://users.ox.ac.uk/~rahtz/passivetex/.

     *** /etc/texmf/texmf.cnf.~1~	Sun Sep 16 23:06:18 2001
     --- /etc/texmf/texmf.cnf	Wed Oct 17 15:37:48 2001
     *** 398,409 ****
       % Extra space for the hash table of control sequences (which allows 10K
       % names as distributed).
       hash_extra.context = 25000
     ! hash_extra = 0
       % Max number of characters in all strings, including all error messages,
       % help texts, font names, control sequences.  These values apply to TeX and MP.
       pool_size.context = 750000
     ! pool_size = 125000
       % Minimum pool space after TeX/MP's own strings; must be at least
       % 25000 less than pool_size, but doesn't need to be nearly that large.
       string_vacancies.context = 45000
     --- 398,409 ----
       % Extra space for the hash table of control sequences (which allows 10K
       % names as distributed).
       hash_extra.context = 25000
     ! hash_extra = 50000
       % Max number of characters in all strings, including all error messages,
       % help texts, font names, control sequences.  These values apply to TeX and MP.
       pool_size.context = 750000
     ! pool_size = 1250000
       % Minimum pool space after TeX/MP's own strings; must be at least
       % 25000 less than pool_size, but doesn't need to be nearly that large.
       string_vacancies.context = 45000
     *** 442,448 ****
       param_size.context = 1500
       param_size = 500	% simultaneous macro parameters
       save_size.context = 5000
     ! save_size = 4000	% for saving values outside current group
       stack_size.context = 1500
       stack_size = 300	% simultaneous input sources
     --- 442,448 ----
       param_size.context = 1500
       param_size = 500	% simultaneous macro parameters
       save_size.context = 5000
     ! save_size = 10000	% for saving values outside current group
       stack_size.context = 1500
       stack_size = 300	% simultaneous input sources
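
To find out which parameter to raise after a failed run, the log can be searched for TeX's "capacity exceeded" message. The following Python sketch does this with a regular expression. The function name and the small mapping from message wording to texmf.cnf variable are assumptions made for this example; the mapping covers only three of the settings touched by the diff above, and any other limit is reported without a hint.

```python
import re

# TeX reports an exhausted limit as, for example:
#   ! TeX capacity exceeded, sorry [pool size=125000].
CAPACITY_RE = re.compile(r"TeX capacity exceeded, sorry \[([^=\]]+)=(\d+)\]")

# Assumed mapping from the wording in the log to the texmf.cnf variable;
# deliberately incomplete, extend it as needed.
HINTS = {
    "pool size": "pool_size",
    "save size": "save_size",
    "hash size": "hash_extra",
}


def exceeded_limits(log_text):
    """Return (limit name, current value, texmf.cnf hint or None) tuples."""
    found = []
    for name, value in CAPACITY_RE.findall(log_text):
        name = name.strip()
        found.append((name, int(value), HINTS.get(name)))
    return found


if __name__ == "__main__":
    import sys

    # Usage: python check_tex_log.py sample.log
    with open(sys.argv[1], errors="replace") as log:
        for name, value, hint in exceeded_limits(log.read()):
            print(f"{name} exhausted at {value}; texmf.cnf variable: {hint}")
```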

3.2 DocBook formatting

Formatting DocBook as HTML or PDF can be done with the following sample Makefile.

     # Recipe lines must start with a tab character.
     all: sample.html sample.pdf
     # Remove the generated files and the intermediate TeX output.
     clean:
             rm -f sample.aux sample.fo sample.log sample.out \
                   sample.html sample.pdf
     sample.html: sample.xml custom_html.xsl
             export SGML_CATALOG_FILES=/usr/lib/sgml/catalog ; \
             xsltproc \
             --catalogs \
             -o sample.html \
             custom_html.xsl sample.xml
     sample.pdf: sample.xml custom_fo.xsl
             export SGML_CATALOG_FILES=/usr/lib/sgml/catalog ; \
             xsltproc \
             --catalogs \
             -o sample.fo \
             custom_fo.xsl sample.xml
             pdflatex "&pdfxmltex" sample.fo

4 Web

4.1 Overview

The FSF France web sites are stored in CVS on the Gna! machines. They can be edited from there, and fsffrance.org has a local copy of all the pages that is updated three times per day. The primary purpose of the fsffrance.org machine is to host the FSF France web site, data and programs. However, every web site related to FSF France projects and friends of FSF France is invited to use the machine for web hosting if needed.

4.2 Virtual Hosts

The /etc/apache/httpd.conf was configured with virtual hosts. It handles the following domain names:

with a document root at /storage/www/www.fsffrance.org/htdocs.

All new domains should use a similar policy, which will help keep things simple.

4.3 Editing the Web

All users registered in the FSF France www management project can edit the HTML repository using CVS/SSH. Instructions on how to do so can be found on Gna! The modified files will be updated on fsffrance.org within one day (normally less than that). The pages are stored in .xhtml files and formatted automatically to .html files. XHTML and XSLT Quick Start will tell you how to deal with this within seconds.

4.4 Separate layout and content

It is convenient to separate the layout of an HTML page from the actual content. This avoids duplicating the menus in each page and gives a nice display without the burden of fixing every page when something needs fixing. Using SSI (Server Side Includes) is a solution to this problem, but it has the disadvantage that the web server serving the pages must have this feature enabled, which is not always the case.

We chose to use an XSLT processor to separate the content from the layout. An XSLT processor provides something similar to a C preprocessor but is applied to XML files. The content of the pages is written in XHTML, which is a slightly different form of HTML that makes it XML compliant. The layout of the pages is stored in XSL files, and the XSLT processor merges the two to produce an HTML page that will be used for display.

     XSL file    ----\
                      => XSLT processor ----> HTML file
     XHTML file  ----/

We chose XSL rather than any other method because it is a widely accepted standard and because Free Software tools that implement it are available everywhere. There are many possible uses of XSL, but this chapter focuses only on using it to separate the layout of a web site from its content.
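
To make the merge step concrete, here is a rough Python sketch of what happens conceptually. It is not XSLT and it is not how the FSF France site is built: it only imitates the effect of a layout stylesheet with the standard library XML module. The layout, the page, the "content" placeholder and the merge function are all invented for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: menus live here once, with an empty slot for content.
LAYOUT = """<html>
  <head><title>Site</title></head>
  <body>
    <div id="menu"><a href="/">Home</a></div>
    <div id="content"/>
  </body>
</html>"""

# Hypothetical content page, written as well-formed XML (XHTML style).
PAGE = """<html>
  <head><title>About</title></head>
  <body><h1>About</h1><p>Hello.</p></body>
</html>"""


def merge(layout_text, page_text):
    """Copy the page title and body into the layout; return HTML text."""
    layout = ET.fromstring(layout_text)
    page = ET.fromstring(page_text)
    # The page decides the title, the layout decides everything around it.
    layout.find("head/title").text = page.findtext("head/title")
    # Move every child of the page body into the layout's content slot.
    slot = layout.find(".//div[@id='content']")
    slot.extend(list(page.find("body")))
    return ET.tostring(layout, encoding="unicode", method="html")


if __name__ == "__main__":
    print(merge(LAYOUT, PAGE))
```

A real XSL file expresses the same idea declaratively: the template holds the menus and says where the content of the XHTML page goes. This also shows why the content must be well-formed XML, since the page is parsed before it can be merged.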

4.4.1 XHTML and XSLT Quick Start

Assuming you're facing a directory that contains a mixture of .xhtml, .html and/or .xsl files and you just want to contribute content without actually understanding what all this is about, here is what you should do:

4.4.2 XSLT processors

There are many XSLT processors. We recommend the following two:

4.4.3 Writing XHTML content

The content of the web site is written in XHTML files that have the .xhtml extension. XHTML looks like regular HTML, so you don't have to learn anything new. Taking an existing HTML file and converting it to XHTML is simple. You only have to take care of the following:

4.4.4 XSL and generated HTML

XSL is not able to deal with HTML that does not look like XML (XHTML). Unfortunately most programs that generate HTML such as makeinfo or hevea do not generate XHTML but plain HTML. In this case including the generated HTML cannot be done with XSL. We suggest the following method:

4.4.5 Writing XSL files

If you're only interested in modifying the menus that are located in XSL files (those with the .xsl extension), you can consider that XSL files are just XHTML files and forget about the rest.

To create your own XSL file without learning XSL, copy the XSL files of the FSF France web site and change them to your liking. The XHTML and XSLT Quick Start chapter will give you instructions to do this.

It takes a few days to get accustomed to XSLT. If you're familiar with m4, you'll find it easy. Otherwise it will look very strange, because it looks like a programming language but is really a data transformation language.

A common pitfall when dealing with XSL is to try to use its most advanced features, which may lead to XSL files that are complex and very hard to understand and debug. Since XSL has almost all the features of a programming language (loops, variables, conditions, functions, etc.) but none of its facilities (debugger, profiler, etc.), complex XSL files will make your life a living hell. It may be worth the effort (for DocBook transformation to HTML, for instance), but you have to carefully evaluate the pros and cons in advance.

A good XSL file is one that even someone not accustomed to XSL can figure out, even if they would not be able to modify its structure. Check the English menus of the FSF France web site for an example.

4.5 Publishing news

The fsfe/fr/news directory contains news articles. The structure of the directory is as follows:

fsfe/fr/Makefile is responsible for applying the fsfe/fr/news/news.LG.xsl files to the fsfe/fr/news/fsfe-fr-channel.LG.xml channels in order to produce the fsfe/fr/news/news-bytes.LG.xhtml files. The fsfe/fr/index.fr.xhtml page includes the news using the following:

     <?xml version="1.0" encoding="iso-8859-1" ?>
     <!DOCTYPE html
               PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
               "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
               [<!ENTITY news SYSTEM "news/news-bytes.fr.xhtml">]>

Other pages may do the same where appropriate.

If a remote site wants to display the news, its webmaster is free to use the RSS channels for this purpose. We should even encourage this. At present the following web sites read the RSS channels:

The RSS channels available at present are:

The news is sent out each day on dedicated mailing lists (French, English). The process is not automated on FSF Europe's servers at the present time but is handled by Olivier Berger with the rss2mail program (see the fsfenews-fr and fsfenews-en modules in rss2mail's CVS repository for more details).

4.6 Update from CVS

A cvs update is done three times per day in the /storage/www/www.fsffrance.org/htdocs directory by the /etc/cron.d/fsffrance crontab.

This update also generates the HTML pages from the XHTML pages.

4.7 Immediate Update

If you ever need to update the web site immediately, do the following:

     ssh -l www fsffrance.org
     cd /storage/www/www.fsffrance.org/fsffrance
     cvs -z3 -q update -I "*.html" -d

4.8 Coposys

4.8.1 Intro

The Coposys tool is available on the FSFE site.

4.8.2 Configuration

Coposys itself was installed from the CVS tree with this configuration (see the Coposys project homepage for further information):

     ./configure --with-communities=fsf

4.8.3 Install

Several packages were installed with apt-get for Coposys:

The Coposys package was installed with the following command: make install.

4.8.4 Apache configuration

The Apache configuration file (/etc/apache/httpd.conf) was modified to allow CGI execution. The following lines were added in the "www.fsfeurope.org" VirtualHost:

     ScriptAlias /cgi-bin/ "/var/www/www.fsfeurope.org/cgi-bin/"
     <Directory "/var/www/www.fsfeurope.org/cgi-bin">
             AllowOverride None
             Options ExecCGI
             Order allow,deny
             Allow from all
     </Directory>

4.8.5 File permissions and owners

All files and directories installed under

are owned by www:www.

4.8.6 Crontab

One command has been added to the crontab of the user www. Its purpose is to build the maps from the 'database' every 30 minutes if needed:

      */30 * * * * www nice -20 make -s -C /var/www/www.fsfeurope.org/coposys/maps/

4.8.7 HTML design

One HTML page was made to describe and link to the Coposys goals and features. It is available from the FSF Europe HTML repository using CVS/SSH in the /coposys/ directory.

5 Private Data

Most if not all of the private data of FSF France is handled by the FSF Europe Private Data project on Savannah. Access is restricted to its members; there is no anonymous access. If you need access to this repository, write to FSF France at bonjour@fsffrance.org.
