`Web and documentation publishing Guide'
Sysadmin France
This document is for webmasters and users of FSF France machines. It may contain useful bits of information outside this context.
1 Introduction
In many respects, the GNU way of doing things needs to be improved. When this is the case, the idea is to coordinate with the GNU volunteers in charge of a specific aspect in order to improve the system. Doing so is not as hard as contributing to a piece of software. It may sometimes be frustrating to stick to the GNU policy because it is quicker to get a nice result by inventing something completely new. The Web is probably the most striking example. However, since we claim to be part of the GNU project, we don't really have the option to fork something of our own and leave the GNU project behind. Even if it means that we will go slower, coordination with and improvement of the GNU project as a whole should be our goal.
Close cooperation with the GNU project also makes some things a lot easier. The account creation process, the management of the HTML tree from CVS, DNS hosting, mailing lists, etc. are facilities we do not need to re-implement. When installing something entirely new on the GNU machines located in France, it is important to send a note to the relevant mailing lists in advance. This prevents possible duplication of effort and greatly increases the chances that the GNU project will be able to re-use the new facility.
Reference documents describing the GNU environment are the standards (for webmaster information) and fencepost.gnu.org:/gd/gnuorg/sysadmin/sysadmin.texi (for system administration information). For usage of the project hosting facility, see the Gna! documentation.
2 XML and interoperability
When publishing content or building a database for specific purposes (contact database, task lists, account information, permissions, etc.), it is of the utmost importance to choose a data format that allows other programs to re-use the database and your own program to import databases from other programs.
Using XML for that purpose is only half of the answer. XML merely saves you from writing a parser and from the pitfall of inventing yet another data format. It says nothing about the semantics of the tags you decide to use. The harder part is to define the semantics associated with an XML file. For instance, DocBook uses XML, but the core of the work was to define what each tag means and to make sure it covers all possible needs. It took years and is not finished yet. Inventing new semantics for a given purpose is not something that should be undertaken lightly.
In some cases semantics that fit your needs already exist. Even if they are not an established standard (RSS, for instance, is a widespread de facto standard), using them will save you the effort of inventing something completely new. You may even be able to join the group that works on them and enhance them if need be. There are some DTD repositories that can help you find out if something exists:
- XML.com DTD
- RosettaNet
- hr-xml.org
- Schema.net
- XML.org
- Open Applications Group
- Translation Memory eXchange
- Extensible Log Format
Once the data format is chosen, it drives the choice or the writing of the programs that provide a given service. If an application exists that does not support the chosen format, it is fairly reasonable to write a small program that translates the data format into the application-specific data and vice versa. The advantages of doing this are:
- If a new, better program becomes available, you can switch to it by using the chosen data format to dump and reload the data.
- If you contribute the export/import program to the developers of the program, they are likely to accept it because it is based on a standard.
- If you want to display static HTML pages from the XML export, XSLT allows you to do this in a fairly simple way.
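The export/import idea above can be sketched in a few lines. The example below is only an illustration: it assumes an RSS 0.91 style channel as the chosen format and a list of Python dictionaries as the application-specific data. The function names rss_to_rows and rows_to_rss and the sample values are hypothetical and do not belong to any existing tool.

```python
import xml.etree.ElementTree as ET

# Item children handled by this sketch; a real channel may carry more.
FIELDS = ("title", "link", "description")


def rss_to_rows(rss_text):
    """Import: turn the <item> elements of a channel into plain dictionaries."""
    root = ET.fromstring(rss_text)
    rows = []
    for item in root.iter("item"):
        rows.append({name: item.findtext(name, default="") for name in FIELDS})
    return rows


def rows_to_rss(rows, channel_title):
    """Export: build a channel from the application's dictionaries.

    Required channel elements other than <title> are omitted for brevity.
    """
    rss = ET.Element("rss", version="0.91")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    for row in rows:
        item = ET.SubElement(channel, "item")
        for name in FIELDS:
            ET.SubElement(item, name).text = row.get(name, "")
    return ET.tostring(rss, encoding="unicode")


# Hypothetical application data, dumped to the shared format and reloaded.
rows = [
    {
        "title": "Example news",
        "link": "http://example.org/news/1",
        "description": "Sample entry",
    }
]
dumped = rows_to_rss(rows, "Example channel")
reloaded = rss_to_rows(dumped)
print(reloaded == rows)
```

Because both directions go through the shared format, switching to another program only requires that program to read or write the same channel file.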
3 DocBook Quick Start
When using a Debian unstable distribution it is fairly easy to write and format DocBook documentation with few manual operations. However, one has to know the shortest path, otherwise it quickly turns into a nightmare. An XSLT-based (as opposed to DSSSL) set of packages and commands is proposed here, assuming you are running unstable. At the date of this writing (October 2001) it also works if you are running testing, but requires installing the packages recommended here from unstable.
3.1 DocBook packages
On GNU/Linux Debian unstable install the following packages:
    apt-get install xsltproc
    apt-get install docbook-xsl
    apt-get install xmltex

Install passivetex as found at http://users.ox.ac.uk/~rahtz/passivetex/.

    cd /usr/share/texmf/tex/latex/
    mkdir passivetex
    cd passivetex
    unzip /tmp/passivetex.zip
    cd /usr/share/texmf/tex/xmltex/base
    pdftex -ini "&pdflatex" pdfxmltex.ini
    fmtutil --missing
    texlinks
    texhash

Finally, the global memory size parameters of TeX must be increased. If formatting fails, check for "exceeded" errors and increase the corresponding parameter. Some sample values are available at http://users.ox.ac.uk/~rahtz/passivetex/.

    *** /etc/texmf/texmf.cnf.~1~   Sun Sep 16 23:06:18 2001
    --- /etc/texmf/texmf.cnf       Wed Oct 17 15:37:48 2001
    ***************
    *** 398,409 ****
      % Extra space for the hash table of control sequences (which allows 10K
      % names as distributed).
      hash_extra.context = 25000
    ! hash_extra = 0

      % Max number of characters in all strings, including all error messages,
      % help texts, font names, control sequences. These values apply to TeX and MP.
      pool_size.context = 750000
    ! pool_size = 125000
      % Minimum pool space after TeX/MP's own strings; must be at least
      % 25000 less than pool_size, but doesn't need to be nearly that large.
      string_vacancies.context = 45000
    --- 398,409 ----
      % Extra space for the hash table of control sequences (which allows 10K
      % names as distributed).
      hash_extra.context = 25000
    ! hash_extra = 50000

      % Max number of characters in all strings, including all error messages,
      % help texts, font names, control sequences. These values apply to TeX and MP.
      pool_size.context = 750000
    ! pool_size = 1250000
      % Minimum pool space after TeX/MP's own strings; must be at least
      % 25000 less than pool_size, but doesn't need to be nearly that large.
      string_vacancies.context = 45000
    ***************
    *** 442,448 ****
      param_size.context = 1500
      param_size = 500 % simultaneous macro parameters
      save_size.context = 5000
    ! save_size = 4000 % for saving values outside current group
      stack_size.context = 1500
      stack_size = 300 % simultaneous input sources
    --- 442,448 ----
      param_size.context = 1500
      param_size = 500 % simultaneous macro parameters
      save_size.context = 5000
    ! save_size = 10000 % for saving values outside current group
      stack_size.context = 1500
      stack_size = 300 % simultaneous input sources

3.2 DocBook formatting
Formatting DocBook to HTML or PDF can be done with the following sample Makefile (each command line of a rule must start with a tab character).

    all: sample.html sample.pdf

    clean:
            rm -f sample.aux sample.fo sample.log sample.out \
                  sample.html sample.pdf

    sample.html: sample.xml custom_html.xsl
            export SGML_CATALOG_FILES=/usr/lib/sgml/catalog ; \
            xsltproc \
                    --catalogs \
                    -o sample.html \
                    custom_html.xsl sample.xml

    sample.pdf: sample.xml custom_fo.xsl
            export SGML_CATALOG_FILES=/usr/lib/sgml/catalog ; \
            xsltproc \
                    --catalogs \
                    -o sample.fo \
                    custom_fo.xsl sample.xml
            pdflatex "&pdfxmltex" sample.fo

4 Web
4.1 Overview
The FSF France web sites are stored in CVS on the Gna! machines. They can be edited from there, and fsffrance.org has a local copy of all the pages that is updated three times per day. The primary purpose of the fsffrance.org machine is to host the FSF France web site, data and programs. However, every web site related to the FSF France projects and to friends of the FSF France is invited to use the machine for web hosting if needed.
4.2 Virtual Hosts
The /etc/apache/httpd.conf was configured with virtual hosts. It handles the following domain names:
- `fsffrance.org', with a document root at /storage/www/www.fsffrance.org/htdocs.
All new domains should follow a similar policy, which helps keep things simple.
4.3 Editing the Web
All users registered in the FSF France www management project can edit the HTML repository using CVS/SSH. Instructions on how to do so can be found on Gna! The modified files will be updated on fsffrance.org within one day (normally less than that). The pages are stored in .xhtml files and formatted automatically to .html files. XHTML and XSLT Quick Start will tell you how to deal with this within seconds.
4.4 Separate layout and content
It is convenient to separate the layout of an HTML page from the actual content. It avoids duplicating the menus in each page and gives a nice display without the burden of fixing every page when something needs fixing. Using SSI (Server Side Includes) is a solution to this problem, but it has the disadvantage that it requires the web server displaying the pages to have this feature enabled, which is not always the case.
We chose to use an XSLT processor to separate the content from the layout. An XSLT processor provides something similar to a C preprocessor but is applied to XML files. The content of the pages is written in XHTML, which is a slightly different form of HTML that makes it XML compliant. The layout of the pages is stored in XSL files and the XSLT processor merges the two to produce an HTML page that will be used for display.

    XSL file   ----\
                    => XSLT processor ----> HTML file
    XHTML file ----/

We chose to use XSL rather than any other method because it is a widely accepted standard and because Free Software tools that implement it are available everywhere. There are many possible uses of XSL but this chapter only focuses on using it to separate the layout of a web site from its content.
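The merge performed by the XSLT processor can be pictured with a small stand-in. The sketch below does not use XSLT: it relies on Python's xml.etree.ElementTree to copy the title and body of an XHTML content page into a layout that holds the menu, which is the same division of labour as the one described above. The LAYOUT and CONTENT strings, the "content" id and the function name apply_layout are assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: the menu is written once, the content slot is marked by an id.
LAYOUT = """<html>
  <head><title>Site</title></head>
  <body>
    <ul><li><a href="index.html">Home</a></li></ul>
    <div id="content" />
  </body>
</html>"""

# Hypothetical content page: plain XHTML, without any menu.
CONTENT = """<html>
  <head><title>My Title</title></head>
  <body><p>My stuff</p></body>
</html>"""


def apply_layout(layout_text, content_text):
    """Return an HTML string: the layout with the page title and body filled in."""
    layout = ET.fromstring(layout_text)
    content = ET.fromstring(content_text)
    # The title comes from the content page.
    layout.find("head/title").text = content.findtext("head/title")
    # The body of the content page fills the slot reserved in the layout.
    slot = layout.find(".//div[@id='content']")
    body = content.find("body")
    slot.text = body.text
    slot.extend(list(body))
    return ET.tostring(layout, encoding="unicode", method="html")


page = apply_layout(LAYOUT, CONTENT)
print(page)
```

Changing the menu in LAYOUT changes every generated page, while the content pages stay untouched. A real XSL file expresses the same thing with templates instead of Python code.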
4.4.1 XHTML and XSLT Quick Start
Assuming you're facing a directory that contains a mixture of .xhtml, .html and/or .xsl files and you just want to contribute content without actually understanding what all this is about, here is what you should do:
- Modify the .xhtml file, not the generated .html file.
- If you want to create a new file, create it with the .xhtml extension. Copy a boilerplate.xhtml from somewhere or copy and modify a .xhtml file in the same directory.
- To see the resulting HTML file, look for a Makefile in the directory or the upper directories that recursively walks the tree to format XHTML files into HTML files. If a grep for sabcmd or xsltproc on a Makefile finds something, you have probably found the right one. Run make all and look at the generated HTML file with your web browser.
- To add a menu entry, edit the XSL file that ends with the .xsl extension. It should contain the menu entries: that is why these files exist in the first place. Although the syntax of this file may not be familiar, the menu entries themselves are written in plain XHTML.
- To re-use existing XSL files for your own web pages do the following:

    #
    # Make sure you have the sablotron XSLT processor
    #
    apt-get install sablotron
    #
    # Get the XSL files
    #
    wget http://fsffrance.org/navigation.fr.xsl
    wget http://fsffrance.org/fsfe-fr.xsl
    #
    # Get the Makefile
    #
    wget -O Makefile http://fsffrance.org/Makefile.sample
    #
    # Get the boilerplate
    #
    wget http://fsffrance.org/boilerplate.fr.xhtml
    #
    # edit fsfe-fr.xsl and remove all references to navigation.*.xsl other
    # than navigation.en.xsl.
    #
    # Build the boilerplate.fr.html file
    #
    make all
    #
    # View boilerplate.fr.html with a web browser and start customizing
    #

If you are using a Red Hat distribution, you first need to install expat before installing sablotron:

    # get expat's RPM file
    ftp speakeasy.rpmfind.net
    user ftp
    pass
    cd /linux/C/redhat/7.1/en/os/i386/RedHat/RPMS
    get expat-1.95.1-1.i386.rpm
    #
    # install expat
    rpm -i expat-1.95.1-1.i386.rpm
    #
    # get sablotron (the URL is quoted because it contains a ?)
    wget 'http://www.gingerall.com/perl/rd?url=sablot/sablotron-0.52-1.i386.rpm'
    # install sablotron
    rpm -i sablotron-0.52-1.i386.rpm
    #

4.4.2 XSLT processors
There are many XSLT processors. We recommend the following two:
- sablotron, which provides the sabcmd command
- libxslt, which provides the xsltproc command
4.4.3 Writing XHTML content
The content of the web site is written in XHTML files that have the .xhtml extension. XHTML looks like regular HTML, so you don't have to learn anything new. Taking an existing HTML file and converting it to XHTML is simple. You only have to take care of the following:
- Add the following at the beginning of the file:

    <?xml version="1.0" encoding="iso-8859-1" ?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

  This declaration says that your file is in XHTML and that it contains iso-8859-1 characters.
- Always close opening tags or the XSLT processor will bark. If you want to use a single tag, end it with a />.

    <br />
    <ul>
      <li> Item 1 </li>
      <li> Item 2 </li>
    </ul>

- Wrap the content of the page using the proper tags.

    <html>
      <head>
        <title>My Title</title>
      </head>
      <body>
        My stuff
      </body>
    </html>

  Web browsers may very well cope with the fact that you forget those but XSLT processors will not.
- Unlike HTML, tag and attribute names are case-sensitive and they are defined in lower case.

    <title> My Title </title>

- Attribute values must always be inside quotes.

    <a href="URL">What is it</a>

4.4.4 XSL and generated HTML
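An XSLT processor rejects any input that is not well-formed XML, whether the file was written by hand or produced by a tool. The following is a minimal sketch of a check that can be run before the processor, assuming Python's standard xml.etree.ElementTree parser. The function name is_well_formed and the sample strings are hypothetical. It only tests well-formedness and does not validate against the XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return (True, None) if text parses as XML, else (False, error message)."""
    try:
        ET.fromstring(text)
    except ET.ParseError as error:
        return False, str(error)
    return True, None


# Plain HTML habits that browsers accept but an XML parser refuses.
unclosed = "<html><body>First line<br>Second line</body></html>"
unquoted = "<html><body><a href=URL>What is it</a></body></html>"
# The same content written according to the XHTML rules.
fixed = (
    "<html><body>First line<br />Second line "
    '<a href="URL">What is it</a></body></html>'
)

for sample in (unclosed, unquoted, fixed):
    ok, message = is_well_formed(sample)
    print("well-formed" if ok else "rejected: " + message)
```

The error message returned for a rejected file includes the line and column reported by the parser, which usually points close to the offending tag.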
XSL is not able to deal with HTML that does not look like XML (XHTML). Unfortunately most programs that generate HTML such as makeinfo or hevea do not generate XHTML but plain HTML. In this case including the generated HTML cannot be done with XSL. We suggest the following method:
- Add a SSI line to include the generated HTML file.

    <!--#include virtual="sysadmin-france.en.html"-->

- After running the XSLT processor, replace the SSI line with the content of the generated HTML file.
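
    perl -MFile::Copy -pi -e \
      '$| = 1; copy("$1", \*ARGVOUT) if(/\#include virtual=\"(.*?)\"/);' \
      result.html

With -i, perl writes the edited file through the ARGVOUT file handle rather than STDOUT, so the generated HTML is copied to that handle, just before the SSI line itself.
4.4.5 Writing XSL files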
If you're only interested in modifying the menus that are located in XSL files (those with the .xsl extension), you can consider that XSL files are just XHTML files and forget about the rest.
To create your own XSL file without learning XSL, copy the XSL files of the FSF France web site and change them to your liking. The XHTML and XSLT Quick Start chapter will give you instructions to do this.
It takes a few days to get accustomed to XSLT. If you're familiar with m4 you will find it easy. Otherwise it will look very strange because it looks like a programming language but is really a data transformation language.
A common pitfall when dealing with XSL is to try to use its most advanced features, which may lead to XSL files that are complex and very hard to understand and debug. Since XSL has almost all the features of a programming language (loops, variables, conditions, functions, etc.) but none of its facilities (debugger, profiler, etc.), complex XSL files will make your life a living hell. It may be worth the effort (for DocBook transformation to HTML, for instance), but you have to carefully evaluate the pros and cons in advance.
A good XSL file is a file that even someone not accustomed to XSL could figure out, even without being able to modify its structure. Check the English menus of the FSF France web site for an example.
4.5 Publishing news
The fsfe/fr/news directory contains news articles. The structure of the directory is as follows:
- Each article is in a file named articleYYYY-MM-DD-NN.LG.xhtml where YYYY-MM-DD is the date of publication, NN is a two-digit serial number if two or more news items are published the same day, and LG is the language code in lower case.

    article2001-04-26-01.fr.xhtml

- If an article contains sub-parts or pictures, a directory named articleYYYY-MM-DD-NN is created and all elements are added into it. No specific convention is to be followed in this sub-directory.
- An RSS file must be built for each language with a file name of fsfe-fr-channel.LG.xml where LG is the language code in lower case. It's ok to use RSS-0.91 instead of RSS-1.0 since the latter is backward compatible. Whenever an article is added in the news directory, a corresponding item entry should be added to the RSS file corresponding to the language. The RSS DTD and necessary catalog file are available to enable users of Emacs/PSGML to produce a valid file.
- An XSL file must be provided for each language with the name news.LG.xsl. This XSL file will be used to format the corresponding fsfe-fr-channel.LG.xml into a news-bytes.LG.xhtml file by the top level makefile. LG is the language code in lower case.
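The naming rules above are regular enough to be checked by a script. The following sketch is an illustration, not the code used by the top level makefile: it parses an article file name, derives the per-language file names and builds the item entry to add to the channel. The function names, the sample title and the link value are hypothetical, and the pattern assumes a two-letter language code and a serial number that is always present.

```python
import re
import xml.etree.ElementTree as ET

# articleYYYY-MM-DD-NN.LG.xhtml (assumption: LG has two letters, NN is always given)
ARTICLE_RE = re.compile(
    r"^article(?P<date>\d{4}-\d{2}-\d{2})-(?P<serial>\d{2})"
    r"\.(?P<lang>[a-z]{2})\.xhtml$"
)


def parse_article_name(name):
    """Return the date, serial number and language of an article file name."""
    match = ARTICLE_RE.match(name)
    if match is None:
        raise ValueError("not a news article file name: " + name)
    return match.groupdict()


def related_files(lang):
    """Names of the per-language files that go with the articles."""
    return {
        "channel": "fsfe-fr-channel.%s.xml" % lang,
        "stylesheet": "news.%s.xsl" % lang,
        "generated": "news-bytes.%s.xhtml" % lang,
    }


def make_item(title, link):
    """Build an RSS <item> element; title and link are supplied by the author."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    return item


info = parse_article_name("article2001-04-26-01.fr.xhtml")
files = related_files(info["lang"])
# Hypothetical title and link, for illustration only.
item = make_item("Sample title", "news/article2001-04-26-01.fr.html")
print(info["date"], files["channel"])
print(ET.tostring(item, encoding="unicode"))
```

A misnamed file such as article2001-4-26.fr.xhtml raises ValueError, which makes the check usable before committing a new article.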
fsfe/fr/Makefile is responsible for applying the fsfe/fr/news/news.LG.xsl files to the fsfe/fr/news/fsfe-fr-channel.LG.xml channels in order to produce the fsfe/fr/news/news-bytes.LG.xhtml files. The fsfe/fr/index.fr.xhtml file includes the news using the following:

    <?xml version="1.0" encoding="iso-8859-1" ?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" [
      <!ENTITY news SYSTEM "news/news-bytes.fr.xhtml">
    ]>
    ...
    &news;
    ...

Other pages may do the same where appropriate.
If a remote site wants to print the news, the webmaster is free to use the RSS channels for this purpose. We should even encourage this. At present the following web sites read the RSS channels:
- SomeNews.org, contact Fabien Penso, plugin sources.
The RSS channels available at present are:
- FSF France French
- FSF France English
The news is dispatched each day to a dedicated mailing list (French, English). The process is not automated on FSF Europe's servers at the present time; it is handled by Olivier Berger with the rss2mail program (see the fsfenews-fr and fsfenews-en modules in rss2mail's CVS repository for more details).
4.6 Update from CVS
A cvs update is done three times per day in the /storage/www/www.fsffrance.org/htdocs directory by the /etc/cron.d/fsffrance crontab. This update generates the HTML pages from the XHTML pages.
4.7 Immediate Update
If you ever want to update the web in no time you have to do the following:

    ssh -l www fsffrance.org
    cd /storage/www/www.fsffrance.org/fsffrance
    cvs -z3 -q update -I "*.html" -d
    make

4.8 Coposys
4.8.1 Intro
The tool Coposys is available on the
4.8.2 Configuration
Coposys itself was installed from the CVS tree with this configuration (see the coposys project homepage for further information):

    ./configure --with-communities=fsf \
      --with-markerdir=/var/www/coposys \
      --with-mapdir=/var/www/www.fsfeurope.org/coposys/maps \
      --with-cgidir=/var/www/www.fsfeurope.org/cgi-bin \
      --with-owner=www \
      --with-group=www

4.8.3 Install
Several packages were installed with apt-get for Coposys:
- xbase-clients
- xplanet
- glutg3
- libnetpbm9
- libungif4g
- imagemagick
- libhdf4g
- libmagick5
- libwmf0
- netpbm
The Coposys package was installed with the following command: make install.
4.8.4 Apache configuration
The Apache configuration file (/etc/apache/httpd.conf) was modified to allow CGI execution. The following lines were added in the "www.fsfeurope.org" VirtualHost:

    ScriptAlias /cgi-bin/ "/var/www/www.fsfeurope.org/cgi-bin/"
    <Directory "/var/www/www.fsfeurope.org/cgi-bin">
        AllowOverride None
        Options ExecCGI
        Order allow,deny
        Allow from all
    </Directory>

4.8.5 File permissions and owners
All files and directories installed under
- /var/www/www.fsfeurope.org/coposys/markers/
- /var/www/www.fsfeurope.org/coposys/maps/
- /var/www/www.fsfeurope.org/cgi-bin/
are owned by www:www.
4.8.6 Crontab
One command has been added to the crontab of the user www. Its purpose is to build the maps from the 'database' every 30 minutes if needed:

    MAILTO=cyril@bouthors.org
    */30 * * * * www nice -20 make -s -C /var/www/www.fsfeurope.org/coposys/maps/

4.8.7 HTML design
One HTML page was made in order to describe and link to coposys goals and functionalities. It is available from the FSF Europe HTML repository using CVS/SSH in the /coposys/ directory.
5 Private Data
Most if not all of the private data of the FSF France is handled by the FSF Europe Private Data project on Savannah. Access is restricted to its members; there is no anonymous access. If you need access to this repository, write to FSF France at contact@fsffrance.org.
Index of Concepts
- .xhtml files editing: XHTML and XSLT Quick Start
- DocBook packages: DocBook packages
- DocBook usage: DocBook Quick Start
- DocBook using unstable or testing: DocBook Quick Start
- DSSSL not used for DocBook: DocBook Quick Start
- edit .xhtml files: XHTML and XSLT Quick Start
- edit fsffrance.org: Editing the Web
- fsfe project: Editing the Web
- fsffrance.org updates: Update from CVS
- libxslt: XSLT processors
- passivetex: DocBook packages
- re-use XSL example: Writing XSL files
- sabcmd: XSLT processors
- sablotron: XSLT processors
- SSI: Separate layout and content
- www.gnu.org: Update from CVS
- XHTML: Separate layout and content
- XHTML and XSLT quick start: XHTML and XSLT Quick Start
- XHTML rules: Writing XHTML content
- xmltex: DocBook packages
- XSL examples: Writing XSL files
- XSL files of your own: XHTML and XSLT Quick Start
- XSL learning curve: Writing XSL files
- XSL pitfall: Writing XSL files
- XSLT: Separate layout and content
- XSLT and DocBook: DocBook Quick Start
- XSLT and XHTML quick start: XHTML and XSLT Quick Start
- xsltproc: XSLT processors
- xsltproc: DocBook packages
- your own XSL files: XHTML and XSLT Quick Start

