
4Suite FAQ

This version:
Revision 1.8 (2005-01-17)

Legal Notice

This document can be freely translated and distributed. It is released under the LDP License.

Abstract

This document attempts to answer frequently asked questions about 4Suite.

4Suite is a Python-based toolkit for XML and RDF application development. It features a library of integrated tools for XML processing, implementing open technologies such as DOM, RDF, XSLT, XInclude, XPointer, XLink, XPath, XUpdate, RELAX NG, and XML/SGML Catalogs. Layered upon this is an XML and RDF data repository and server, which supports multiple methods of data access, query, indexing, transformation, rich linking, and rule processing, and provides the data infrastructure of a full database system, including transactions, concurrency, access control, and management tools. It also supports HTTP (including SOAP and WebDAV), RPC, FTP, and CORBA.



1 General

1.1 What platforms are supported by 4Suite?

4Suite can be installed on Windows or POSIX (UNIX-like) platforms. Red Hat RPMs and Windows binary installers are available from 4suite.org, and 4Suite can be built from source on all other platforms. Some OS vendors provide their own 4Suite distribution packages.

4Suite is developed & tested on these platforms:

  • Linux (Red Hat, Debian and other common distros)

  • FreeBSD

  • Mac OS X

  • Cygwin

  • Solaris

  • Windows XP

  • Windows 2000

Other platforms may be able to run 4Suite, but support for them is not a priority for the developers.

The developers' testing generally consists of doing a clean build, install, repository init (FlatFile on Windows, Postgres or FlatFile on POSIX), a quick click-through of the repository HTTP demos, and running the test suites.

Please report porting and packaging problems to the 4Suite mailing list.

1.2 What packages are required to run 4Suite?

The main requirement is Python 2.2.1 or greater. PyXML is recommended if your Python installation does not have PyExpat (most do have it), or if you want to be able to parse XML with DTD validation, or if you want to be able to parse XML with support for XML or SGML Catalogs. AT&T GraphViz is required if you want to use the RDF graphing tool. For more details, see the 4Suite compatibility matrix and release timeline as well as the installation guide for the relevant platform.
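A quick way to check the two prerequisites mentioned above is to ask the interpreter directly. The following is only a sketch: the function names are made up for illustration, and it tests nothing beyond the Python version and whether the Expat bindings can be imported.

```python
import sys


def has_pyexpat():
    """Return True if the Expat bindings can be imported."""
    try:
        import xml.parsers.expat  # thin wrapper around the pyexpat extension
    except ImportError:
        return False
    return True


def meets_minimum_python(minimum=(2, 2, 1)):
    """Compare the running interpreter against the stated minimum version."""
    return tuple(sys.version_info[:3]) >= tuple(minimum)


if __name__ == "__main__":
    print("Python >= 2.2.1: %s" % meets_minimum_python())
    print("PyExpat available: %s" % has_pyexpat())
```

If the second line reports False, installing PyXML is the route the answer above suggests.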

1.3 Where's the documentation?

Complete documentation is a relatively low priority before the 1.0 gold release. There is enough to get you started, though:

  • Linked from the home page of the 4suite.org web site are installation and repository setup guides for the latest release version.

  • Uche's 4Suite Akara is a Wiki-like site containing many essential articles and notes that may end up in future documentation.

  • When building from source, you can generate fresh versions of the installation and setup guides, plus full API documentation, by appending --with-docs to your invocation of setup.py install. Or you could download a prebuilt version of these docs.

1.4 I am having weird import errors for, say, cDomlette or boolean

Make sure you are not running the tools from the source directory of 4Suite (i.e. the dir in which setup.py exists).

1.5 I installed an update or patch to 4Suite, and now I am having mysterious problems, or I still see the old behavior.

If things seem amiss, often the best thing to do is clean house and start again. At a minimum, do the following:

  1. Remove the build directory in the source (it was created when you ran setup.py);

  2. Remove all the .pyc and .pyo files in the Python library directory where 4Suite is installed;

    For example, on UNIX, one might use some variation of this command (the parentheses make -exec apply to both patterns, not just the last one): find /usr/local/lib/python2.2/site-packages/Ft \( -name '*.pyc' -o -name '*.pyo' \) -exec rm {} \;

  3. Run setup.py again.
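On platforms without find, step 2 can be done from Python itself. This is a minimal sketch, not a 4Suite tool: remove_compiled_files is a hypothetical helper name, and the code assumes a reasonably recent Python.

```python
import os


def remove_compiled_files(root):
    """Delete every .pyc and .pyo file under root; return the removed paths."""
    removed = []
    for dirpath, dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith((".pyc", ".pyo")):
                path = os.path.join(dirpath, filename)
                os.remove(path)
                removed.append(path)
    return removed
```

It would be called with the directory where 4Suite is installed, for example remove_compiled_files("/usr/local/lib/python2.2/site-packages/Ft") on a system laid out like the one in step 2.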

1.6 What are all the command-line utilities?

The command-line utilities provide a convenient interface for using 4Suite's Python libraries from an interactive shell or shell script. Many users are content to use these tools so they don't ever have to write any Python code.

Run each tool with the --help option to get usage information.

  • 4xml - parses and reserializes XML from a file. Supports XInclude, RELAX NG, and, if PyXML is installed, DTD validation.

  • 4xpath - parses XML from a file and evaluates an XPath expression against it. The document's root node will be the context node.

  • 4xslt - performs XSLT processing. Supports XInclude (can be turned off), compiled stylesheets, stylesheet chaining, xml-stylesheet processing instruction, alternative include/import paths, and more.

  • 4xupdate - performs XUpdate processing.

  • 4rdf - reads and writes RDF data. Supports regular files as well as a 4Suite repository-like database. Supports these serialization formats: RDF XML, n-triples, W3C prolog, python list-of-tuples.

  • 4versa - reads RDF data and performs a Versa query against it. Supports regular files as well as a 4Suite repository-like database. If reading from a file, supports RDF XML and n-triples formats.

  • 4ss - manages files in the repository. There are about 40 subcommands. To see them, run 4ss --show-commands. Help is available at each level with the --help option, e.g. 4ss --help; 4ss create --help; 4ss create document --help.

  • 4ss_manager - administers the repository itself, and its servers. As with 4ss, run 4ss_manager --show-commands to see the various subcommands.

1.7 Does 4XSLT support <?xml-stylesheet?> processing instructions?

Yes. The processing instruction (PI) must appear in the prolog, must contain an RFC 3023 compliant type pseudo-attribute, and must not have a media pseudo-attribute that differs from the preferred value, which you can set. If everything checks out, the PI's href pseudo-attribute will be resolved to find an associated stylesheet. Recognized values for type are 'application/xslt+xml', 'application/xslt', 'application/xml', and 'text/xml'. Microsoft's nonstandard 'text/xsl' is not supported by default.

If more than one <?xml-stylesheet?> PI is given, it has the effect of the first stylesheet doing an xsl:import of the next, so the stylesheet referenced by each PI will have a higher import precedence than the one for the PI that comes next.

If any <?xml-stylesheet?> PIs bearing alternate="yes" pseudo-attributes are present, they will be considered, along with the first PI without alternate="yes", as possible candidates for the first stylesheet in the import tree. If there is more than one candidate, the first non-alternate will be chosen, or if there are only alternates, the first alternate will be chosen.
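The rules for picking the first stylesheet can be sketched with the standard library's minidom. This illustrates the rules as described in this answer; it is not 4XSLT's actual implementation. The function names are hypothetical, and treating an unset preferred media value as "ignore the media pseudo-attribute" is an assumption.

```python
import re
from xml.dom import Node
from xml.dom.minidom import parseString

# Values of the type pseudo-attribute that the FAQ lists as recognized.
ACCEPTED_TYPES = ("application/xslt+xml", "application/xslt",
                  "application/xml", "text/xml")

_PSEUDO_ATTR = re.compile(r"""([\w-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def parse_pseudo_attributes(data):
    """Turn the PI data, e.g. type="text/xml" href="a.xsl", into a dict."""
    attrs = {}
    for name, double_quoted, single_quoted in _PSEUDO_ATTR.findall(data):
        attrs[name] = double_quoted or single_quoted
    return attrs


def first_stylesheet_href(xml_text, preferred_media=None):
    """Return the href of the stylesheet that would head the import tree."""
    doc = parseString(xml_text)
    first_plain = None
    first_alternate = None
    for node in doc.childNodes:
        if node.nodeType == Node.ELEMENT_NODE:
            break  # the prolog ends where the document element begins
        if (node.nodeType != Node.PROCESSING_INSTRUCTION_NODE
                or node.target != "xml-stylesheet"):
            continue
        attrs = parse_pseudo_attributes(node.data)
        if attrs.get("type") not in ACCEPTED_TYPES or "href" not in attrs:
            continue
        # Assumption: media is only checked when a preference was given.
        if (preferred_media is not None and "media" in attrs
                and attrs["media"] != preferred_media):
            continue
        if attrs.get("alternate") == "yes":
            if first_alternate is None:
                first_alternate = attrs["href"]
        elif first_plain is None:
            first_plain = attrs["href"]
    # The first non-alternate wins; alternates are used only as a fallback.
    if first_plain is not None:
        return first_plain
    return first_alternate
```

A PI with type="text/xsl" is skipped by this sketch, which matches the default behavior described above.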

1.8 What is the best way to submit 4Suite patches and bug reports?

The best way is to join the 4Suite mailing list and post your report there. That way, the problem can be discussed and resolved in a public forum, which will be of benefit to other users of 4Suite. The developers try to respond to every question posted. It's also a good way to make sure your problem hasn't already been fixed.

Advanced users who have SourceForge accounts and know what constitutes a good bug report can submit directly to the 4Suite bug tracker on SourceForge.

1.9 Where can users get support for 4Suite?

Subscribe and post to the 4Suite mailing list. This list has moderate traffic, peaking at about 40 to 60 messages a week, but usually much lighter. There is also a moderated list for 4Suite-related announcements, which generally gets only a handful of posts per year.

1.10 How can I get updates of 4Suite between major releases?

See the 4Suite CVS information page.

1.11 What XML character encodings are supported by 4Suite?

On the input side, the XML and stylesheet readers wrap Expat, which natively supports UTF-16, UTF-16LE, UTF-16BE, UTF-8, ISO-8859-1, and US-ASCII. Other encodings are supported if they are available through Python's codecs.lookup() and meet certain criteria established by Expat: they must be single-byte encodings (so UTF-32, UCS-4, UCS-2 are ruled out), must encode certain characters the same as ASCII (so EBCDIC is ruled out), must represent each character with a distinct series of bytes (so stateful encodings like ISO-2022-x, EUC-JP, and GB2312 are ruled out), and must not represent characters beyond the Basic Multilingual Plane (another reason UCS-4 and UTF-32 wouldn't work). Encodings like KOI8-R, the 8-bit IBM and Microsoft codepages, and the rest of the ISO-8859-x series should work. 4Suite's readers also support external declaration of a document's encoding.

On the output side, more encodings are supported. The XML & HTML domlette printers and the XSLT output methods support any encoding available through Python's codecs.lookup().
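To get a rough idea of whether an input encoding outside the native set might satisfy the criteria above, one can probe Python's codec registry. The sketch below is written for a current Python 3 interpreter, not the Python 2.2 era this FAQ describes. It checks only three things (availability through codecs.lookup(), ASCII compatibility, single-byte output) and says nothing definitive about what Expat or 4Suite will accept. The function name and the three status strings are invented for illustration.

```python
import codecs

# Encodings the FAQ lists as natively supported by Expat, normalized to
# the canonical names used by Python's codec registry.
NATIVE = frozenset(codecs.lookup(n).name for n in
                   ("UTF-8", "UTF-16", "UTF-16LE", "UTF-16BE",
                    "ISO-8859-1", "US-ASCII"))


def _encode_one(char, name):
    """Return the encoded bytes for one character, or None if unencodable."""
    try:
        return char.encode(name)
    except (UnicodeError, LookupError):
        return None


def input_encoding_status(name):
    """Classify an encoding name as 'native', 'plausible' or 'unsupported'."""
    try:
        canonical = codecs.lookup(name).name
    except LookupError:
        return "unsupported"  # Python has no codec by that name
    if canonical in NATIVE:
        return "native"
    # Printable ASCII must keep its ASCII byte values (rules out EBCDIC).
    for code in range(0x20, 0x7F):
        if _encode_one(chr(code), name) != bytes([code]):
            return "unsupported"
    # Every other encodable BMP character must fit in a single byte
    # (rules out multi-byte and escape-sequence based encodings).
    for code in range(0x80, 0x10000):
        encoded = _encode_one(chr(code), name)
        if encoded is not None and len(encoded) != 1:
            return "unsupported"
    return "plausible"
```

With this check, KOI8-R and cp1252 come out as "plausible", while UTF-32, cp037 (an EBCDIC code page) and EUC-JP come out as "unsupported".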

1.12 There are several DOM implementations in Python, PyXML and 4Suite. Which should I use?

Minidom (xml.dom.minidom) comes with Python 2.x and is a very effective, lightweight implementation. 4DOM (xml.dom.*) comes with PyXML and is a full-blown DOM library designed for maximum compliance and features. For maximum speed and simplicity, you might want to try the XPath-oriented "Domlette" that 4Suite provides, which operates at C speed. See this page and this one from Uche Ogbuji's Python/XML Akara site for more discussion. See this page for compliance and feature comparisons.
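For readers who have not used any of these libraries, here is a minimal minidom example. It uses only the standard library; the sample document and the book_titles helper are made up for illustration.

```python
from xml.dom.minidom import parseString

SAMPLE = ("<catalog>"
          "<book id='b1'><title>First</title></book>"
          "<book id='b2'><title>Second</title></book>"
          "</catalog>")


def book_titles(xml_text):
    """Return (id, title) pairs for every book element in the document."""
    doc = parseString(xml_text)
    titles = []
    for book in doc.getElementsByTagName("book"):
        title = book.getElementsByTagName("title")[0]
        # The title text is the element's first (and only) child node.
        titles.append((book.getAttribute("id"), title.firstChild.data))
    return titles


if __name__ == "__main__":
    for book_id, title in book_titles(SAMPLE):
        print("%s: %s" % (book_id, title))
```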

1.13 Why doesn't Domlette have getElementsByTagName()?

The DOM API has various imperfections and C++/Java-isms, some of which we tried to address by using more Pythonic APIs in Domlette. In the case of getElementsByTagName(), the popularity of this method is attributable to its being the only native interface for locating data across the whole node tree. With the advent of XPath, though, and the focus on keeping Domlette as lightweight as possible, this method begins to look rather clumsy and unnecessary. At the very least, it doesn't belong in the core API. If you really want it, consider putting one of these in your code:


# simple but not efficient
from Ft.Xml.XPath import Evaluate
def GetElementsByTagName(node, name):
    return Evaluate(".//" + name, contextNode=node)


# very fast and namespace-aware, but only works in Python 2.2 and up
from __future__ import generators  # needed only in Python 2.2
from xml.dom import Node  # provides the ELEMENT_NODE constant

def doc_order_iterator_filter(node, filter_func):
    # Yield matching nodes in document order: the node, then its descendants.
    if filter_func(node):
        yield node
    for child in node.childNodes:
        # The recursive call has already applied filter_func to each result.
        for cn in doc_order_iterator_filter(child, filter_func):
            yield cn

def get_elements_by_tag_name_ns(node, ns, local):
    return doc_order_iterator_filter(node, lambda n:
      n.nodeType == Node.ELEMENT_NODE and
      n.namespaceURI == ns and n.localName == local)

1.14 Why do I get codec or other strange errors when running under mod_python?

There is a known problem with mod_python in Python 2.2 and older: mod_python runs the interpreter in restricted mode, which the mod_python developers consider a bug. Anything that introspects (as 4Suite does) or does anything else a restricted interpreter can't do will fail with a cryptic exception. The solution is to upgrade to Python 2.3 and use a recent build of mod_python (3.1 or newer).

There is an important entry in the mod_python FAQ that explains issues related to rebuilding older versions of mod_python under Python 2.3.

Some versions of 4Suite also had an issue with byte order inversion in XSLT output when running under mod_python, resulting in garbage output. Again, the recommendation is to make sure you're using current versions of everything.

2 Repository-Related Questions

2.1 Can 4Suite act as a Web Server for my XML documents?

Yes, if you load the documents into the repository. With the HTTP server feature of the repository, 4Suite can listen for a variety of HTTP requests, which you can customize. These can be as simple as a Web browser request, or as complex as SOAP. You have full access to 4Suite's XML processing facilities in tandem with the HTTP server. 4Suite does come with several demos illustrating this. To get started, make sure you follow the instructions to set up the repository in the Quick Start guide.

2.2 I try to do a 4ss_manager init and get an "Invalid Login" error after a few other normal messages.

Make sure you don't have an instance of the repository server already running. If you do, stop it by executing "4ss_manager stop".

2.3 I am tired of entering my username and password over and over again when running repository commands

Use the "4ss agent" command to spawn a command shell where you don't need to log in each time. If you need to run these commands from scripts, you should use 4ss login as the user running the script, and then specify the username on the command line using the --username=<username> option to the desired command.
