.. -*- mode: rst; coding: utf-8 -*-

~~~~~~~~~~~~~~~~~~~~~~~~~
mwlib Configuration Files
~~~~~~~~~~~~~~~~~~~~~~~~~

Configuration files are in a `INI-like format`_. They can contain the sections

``[wiki]``

  Description of how to retrieve MediaWiki articles and templates and meta
  information about the MediaWiki like name or URL.

and

``[images]``

  Description of how to retrieve images.

Both sections must contain a ``type`` setting, on which all other available
settings in that section depend.

``type=mwapi``
==============

Retrieve articles and images via the `MediaWiki API`_, i.e. via ``api.php``.

This requires MediaWiki 1.11 or later.

Specific Options
----------------

``base_url=BASEURL``

  The base URL of the MediaWiki. See the ``--config`` option in
  ``docs/commands.txt``.
  
``template_blacklist=ARTICLE``

  A title of a page in the MediaWiki which contains blacklisted templates.
  Example for such a template blacklist page::

    * [[Template:SkipMe]]
    * [[Template:MeToo]]

``template_exclusion_category=CATEGORY``

  A name of a category: Templates in this category are excluded during
  rendering.

``username=USERNAME``, ``password=PASSWORD``, ``domain=DOMAIN``

  Login with the given credentials (domain is optional).
  

``type=zip``
============

Read articles and images from a ZIP file, generated with ``mw-zip``.

``path=ZIPFILE``

  Path to ZIP file.


``type=cdb``
============

Read articles from a CDB files, generated with ``mw-buildcdb``.

Only allowed in ``[wiki]`` section.

Specific Options
----------------

``path=DIR``

  Directory with CDB files.

``type=net``
============

Fetch articles via HTTP for MediaWikis without the `MediaWiki API`_.

Only allowed in ``[wiki]`` section.

Note that this type of configuration has a few disadvantages like the ''many''
configuration values and the fact that it can't find out the authors of
articles. If your MediaWiki has the `MediaWiki API`_ enabled, please consider
using ``type=mwapi`` instead!

Specific Options
----------------

``url=URL``

  URL of this MediaWiki. This is not used to retrieve articles, but to identify
  the MediaWiki. For example http://en.wiki.pedia.org/ could be used for the
  English Wikipedia.

``name=NAME``

  Name of this MediaWiki, e.g. "English Wikipedia".

``articleurl=PATTERN``

  A pattern used to construct URLs of articles in raw wikitext format from a
  title and a revision. The string "@TITLE@" is replaced by the title and the
  string "@REVISION@" by the revision.
  
  For example, this is the correct pattern for the English Wikipedia::
  
    http://en.wikipedia.org/w/index.php?title=@TITLE@&oldid=@REVISION@&action=raw

``imagedescriptionurls=PATTERNS``

  A space separated list of URL patterns to image description pages. The string
  @TITLE@ is replaced with the image title without the prefix (e.g. without
  "Image:"). The listed URLs are tried from left to right.
  
  For example, these are the correct patterns for the English Wikipedia::
  
    http://commons.wikimedia.org/w/index.php?title=Image:@TITLE@s&action=raw http://en.wikipedia.org/w/index.php?title=Image:@TITLE@s&action=raw

``templateurls=PATTERNS``

  A space separated list of URL patterns to template pages. The string @TITLE@
  is replaced by the name of the template without the prefix (e.g. without
  "Template:"). The listed URLs are tried from left to right. This allows to
  override Templates that come later in the list.
  
  For example this is the correct pattern for the English Wikipedia::
  
    http://en.wikipedia.org/w/index.php?title=@TITLE@&action=raw
  
``templateblacklist=TITLE``

  A title of a page in the MediaWiki which contains blacklisted templates.
  Example for such a template blacklist page::

    * [[Template:SkipMe]]
    * [[Template:MeToo]]

``defaultauthors=TEXT``

  A list of "authors" that get returned as principal article authors for
  every article. Set this to some general string (e.g. "Wikpedians" on a
  Wikipedia) if the authors for different articles can vary.


``type=download``
=================

Fetch images via HTTP for MediaWikis that don't support the `MediaWiki API`_.

Only allowed in the ``[images]`` section.

As with ``type=net`` for the ``[wiki]`` section, consider using ``type=mwapi``
instead.

Specific Options
----------------

``url=BASEURLS``

  A space separated list of "base URLs" for images, i.e. URLs up to the
  hash-path for images. For example these would be the correct list of base URLs
  for the English Wikipedia::
  
     http://upload.wikimedia.org/wikipedia/commons/ http://upload.wikimedia.org/wikipedia/en/

``localpath=DIR``

  If given, fetched images are cached to this directory.

``knownlicenses=LICENSES``

  A space separated list of know license names that can occur on image
  description pages in form of included templates. E.g. "GFDL cc-by-sa".


.. _`INI-like format`: http://docs.python.org/lib/module-ConfigParser.html
.. _`MediaWiki API`: http://www.mediawiki.org/wiki/API
