CWB R3 code architecture

This document is a rough guideline to the architecture of the CWB source code.

The CL library

CL stands for "Corpus Library". This library contains the most basic functions for CWB and CQP. Code in the CL depends only on other code in the CL.

Note that the "dependencies" given below are based on which modules #include each other's headers. Some, e.g. attributes.c and cdaccess.c, are mutually dependent in this sense.
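As a minimal sketch of what mutual dependence in this sense looks like, the single file below stands in for two modules whose headers each declare a function that the other module calls. All names (mod_a_*, mod_b_*) are hypothetical and are not taken from attributes.c or cdaccess.c; in a real source tree each marked section would be its own file, pulled in with #include and protected by include guards.

```c
/* --- would be mod_a.h (hypothetical module A) --- */
#ifndef MOD_A_H
#define MOD_A_H
int mod_a_base(int id);   /* implemented in mod_a.c */
int mod_a_size(int id);   /* implemented in mod_a.c, calls into module B */
#endif

/* --- would be mod_b.h (hypothetical module B) --- */
#ifndef MOD_B_H
#define MOD_B_H
int mod_b_scale(int id);  /* implemented in mod_b.c, calls into module A */
#endif

/* --- would be mod_a.c, which #includes both headers --- */
int mod_a_base(int id)
{
  return id + 1;
}

int mod_a_size(int id)
{
  /* module A needs module B here */
  return mod_a_base(id) * mod_b_scale(id);
}

/* --- would be mod_b.c, which also #includes both headers --- */
int mod_b_scale(int id)
{
  /* module B needs module A here */
  return mod_a_base(id) > 2 ? 2 : 1;
}
```

Because each header only declares functions, the two .c files can include both headers in either order; the include guards keep repeated inclusion harmless.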

cl/cl.h

cl/attributes.h ; cl/attributes.c

cl/binsert.h ; cl/binsert.c

cl/bitfields.h ; cl/bitfields.c

cl/bitio.h ; cl/bitio.c

cl/cdaccess.h ; cl/cdaccess.c

cl/class-mapping.h ; cl/class-mapping.c

cl/compression.h ; cl/compression.c

cl/corpus.h ; cl/corpus.c

cl/dl_stub.c

cl/endian.h ; cl/endian.c

cl/fileutils.h ; cl/fileutils.c

cl/globals.h ; cl/globals.c

cl/lexhash.h ; cl/lexhash.c

cl/list.h ; cl/list.c

cl/macros.h ; cl/macros.c

cl/makecomps.h ; cl/makecomps.c

cl/registry.l ; cl/registry.y

cl/registry.tab.h

cl/regopt.h ; cl/regopt.c

cl/special-chars.h ; cl/special-chars.c

cl/storage.h ; cl/storage.c

cl/Makefile

CQi - Corpus Query interface

This is the "cqpserver" program and some modules that it depends on.

CQi/CQi.h

This file #defines all the CQI_* constants; there are no function prototypes or data structures here.
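To illustrate the shape of a constants-only header, here is a minimal sketch. The DEMO_* names and their values are hypothetical and are not copied from CQi/CQi.h; the point is only that such a header consists of #define lines, while any code that interprets the constants lives in the modules that include it.

```c
/* --- would be the constants-only header (hypothetical names and values) --- */
#ifndef DEMO_CONSTANTS_H
#define DEMO_CONSTANTS_H
#define DEMO_STATUS_OK      1
#define DEMO_ERROR_GENERAL  2
#define DEMO_ERROR_SYNTAX   3
#endif

/* --- would be in a .c file that #includes the header --- */

/* Returns 1 if the code is one of the hypothetical error codes, else 0. */
int demo_is_error(int code)
{
  switch (code) {
  case DEMO_ERROR_GENERAL:
  case DEMO_ERROR_SYNTAX:
    return 1;
  default:
    return 0;
  }
}
```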

This part of CWB depends (a) on the CL library and (b) on CQP.

CQi/cqpserver.c

CQi/auth.h ; CQi/auth.c

CQi/server.h ; CQi/server.c

CQP (query processor and interactive environment)

Dependencies in this directory on the CL are not noted unless especially relevant. Basically everything here depends on the CL one way or another. Also, interdependencies between different cqp modules are not noted.

cqp/ascii-print.c ; cqp/ascii-print.h

cqp/attlist.c ; cqp/attlist.h

cqp/builtins.c ; cqp/builtins.h

cqp/concordance.c ; cqp/concordance.h

cqp/context_descriptor.c ; cqp/context_descriptor.h

cqp/corpmanag.c ; cqp/corpmanag.h

cqp/cqp.c ; cqp/cqp.h

cqp/cqpcl.c

cqp/dummy_auth.c

cqp/eval.c ; cqp/eval.h

cqp/groups.c ; cqp/groups.h

cqp/hash.c ; cqp/hash.h

cqp/html-print.c ; cqp/html-print.h

cqp/latex-print.c ; cqp/latex-print.h

cqp/llquery.c

cqp/macro.c ; cqp/macro.h

cqp/matchlist.c ; cqp/matchlist.h

cqp/options.c ; cqp/options.h

cqp/output.c ; cqp/output.h

cqp/parser.l ; cqp/parser.y

cqp/parse_actions.c ; cqp/parse_actions.h

cqp/paths.c ; cqp/paths.h

cqp/print-modes.c ; cqp/print-modes.h

cqp/print_align.c ; cqp/print_align.h

cqp/ranges.c ; cqp/ranges.h

cqp/regex2dfa.c ; cqp/regex2dfa.h

cqp/sgml-print.c ; cqp/sgml-print.h

cqp/symtab.c ; cqp/symtab.h

cqp/table.c ; cqp/table.h

cqp/targets.c ; cqp/targets.h

cqp/tree.c ; cqp/tree.h

cqp/treemacros.h

cqp/variables.c ; cqp/variables.h

cqp/Makefile

Command-line utilities

Most of these files each contain the code for a single program, which is one of the non-interactive components of CWB. These files do not usually have headers, because the functions in them are used by that program alone.
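The header-less layout can be sketched as follows. This is a hypothetical example, not code from any of the utilities: the helper is declared static, so it is visible only inside its own source file and needs no header. In a real utility the entry point would be main(); it is called demo_run here only so that the fragment can be embedded in other code.

```c
#include <stdio.h>

/* File-local helper: 'static' gives it internal linkage, so no header
   is needed and the name cannot clash with other programs.
   Counts tab-separated fields; an empty or NULL line has 0 fields. */
static int demo_count_fields(const char *line)
{
  int n = 1;

  if (line == NULL || *line == '\0')
    return 0;
  for (; *line != '\0'; line++)
    if (*line == '\t')
      n++;
  return n;
}

/* Stand-in for the program's main() (hypothetical name):
   prints the field count of each command-line argument. */
int demo_run(int argc, char **argv)
{
  int i;

  for (i = 1; i < argc; i++)
    printf("%d\t%s\n", demo_count_fields(argv[i]), argv[i]);
  return 0;
}
```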

These utilities are used most importantly for corpus setup but also for a range of administration tasks.

As a general rule, the utilities depend on the CL library. Most of them #include cl/cl.h but some #include other headers from the CL library.

utils/barlib.c ; utils/barlib.h

utils/feature_maps.c ; utils/feature_maps.h

utils/cwb-align-encode.c

utils/cwb-align-show.c

utils/cwb-align.c

utils/cwb-atoi.c

utils/cwb-compress-rdx.c

utils/cwb-decode-nqrfile.c

utils/cwb-decode.c

utils/cwb-describe-corpus.c

utils/cwb-encode.c

utils/cwb-huffcode.c

utils/cwb-itoa.c

utils/cwb-lexdecode.c

utils/cwb-makeall.c

utils/cwb-s-decode.c

utils/cwb-s-encode.c

utils/cwb-scan-corpus.c

utils/Makefile

Other directories within the CWB root directory

config

The subdirectories here contain chunks of makefile for use when compiling CWB on different operating systems.

doc

This contains documentation of the CWB code (note: not user documentation for CWB/CQP), including this file!

editline

This contains a (slightly patched) version of the Editline library, on which CQP is dependent.

The file editline/README documents why this is included in the CWB source tree.

instutils

This directory contains shell scripts (sh) for configuring / installing CWB.

man

This contains the *.pod source files for the man entries for cqp and the CWB command-line utilities.

Global variables in CL

(This is just an idea --- useful? Or overkill? -- AH)

Name | Type | Defined in | Declared extern in | What is it?
@@@@@@@@
@@@@@@@@
@@@@@@@@
@@@@@@@@
@@@@@@@@
@@@@@@@@

Global variables in CQP

(This is just an idea --- useful? Or overkill? -- AH)

Name | Type | Defined in | Declared extern in | What is it?
@@@@@@@@
@@@@@@@@
@@@@@@@@
@@@@@@@@
@@@@@@@@
@@@@@@@@