MediaWiki Documentation

Here you can find the full developer API for the mediawiki project.

Functions and Classes

MediaWiki

class mediawiki.MediaWiki(url: str = 'https://{lang}.wikipedia.org/w/api.php', lang: str = 'en', timeout: float = 15.0, rate_limit: bool = False, rate_limit_wait: timedelta = datetime.timedelta(microseconds=50000), cat_prefix: str = 'Category', user_agent: str | None = None, username: str | None = None, password: str | None = None, proxies: Dict | None = None, verify_ssl: bool | str = True)

MediaWiki API Wrapper Instance

Parameters:
  • url (str) – API URL of the MediaWiki site; defaults to Wikipedia

  • lang (str) – Language of the MediaWiki site; used to help change API URL

  • timeout (float) – HTTP timeout setting; None means no timeout

  • rate_limit (bool) – Use rate limiting to limit calls to the site

  • rate_limit_wait (timedelta) – Amount of time to wait between requests

  • cat_prefix (str) – The prefix for categories used by the mediawiki site; defaults to Category (en)

  • user_agent (str) – The user agent string to use when making requests; defaults to a string based on the library version, but the MediaWiki API documentation recommends setting a unique one instead of relying on the library’s default

  • username (str) – The username to use to log into the MediaWiki

  • password (str) – The password to use to log into the MediaWiki

  • proxies (dict) – A dictionary of specific proxies to use in the Requests library

  • verify_ssl (bool|str) – Verify SSL Certificates to be passed directly into the Requests library
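As a rough illustration of how these parameters fit together, the sketch below collects constructor keyword arguments in a plain dictionary. The helper name build_client_kwargs and the user-agent string format are assumptions made for this example; only the keyword names come from the signature above.

```python
from datetime import timedelta


def build_client_kwargs(tool_name, contact, lang="en"):
    """Collect keyword arguments for mediawiki.MediaWiki (hypothetical helper)."""
    return {
        "lang": lang,
        "timeout": 15.0,
        # Opt in to rate limiting and wait 50 ms between requests.
        "rate_limit": True,
        "rate_limit_wait": timedelta(milliseconds=50),
        # A unique user agent, as the parameter description recommends.
        "user_agent": f"{tool_name} ({contact})",
    }


kwargs = build_client_kwargs("example-tool/0.1", "ops@example.org")
# With the library installed, this could be passed on as: MediaWiki(**kwargs)
```

Keeping the arguments in a dictionary makes it easy to inspect or log the configuration before any request is made.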

login(username, password)

Login as specified user

Parameters:
  • username (str) – The username to log in with

  • password (str) – The password for the user

  • strict (bool) – True to throw an error on failure

Returns:

True if successfully logged in; False otherwise

Return type:

bool

Raises:

mediawiki.exceptions.MediaWikiLoginError – if unable to login

Note

Per the MediaWiki API, one should use the bot password; see https://www.mediawiki.org/wiki/API:Login for more information

suggest(query)

Gather suggestions based on the provided title or None if no suggestions found

Parameters:

query (str) – Page title

Returns:

Suggested page title or None if no suggestion found

Return type:

str or None

search(query, results=10, suggestion=False)

Search for similar titles

Parameters:
  • query (str) – Page title

  • results (int) – Number of pages to return

  • suggestion (bool) – Use suggestion

Returns:

tuple (list results, suggestion) if suggestion is True; list of results otherwise

Return type:

tuple or list

Note

Could add ability to continue past the limit of 500
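Because the return shape of search() depends on the suggestion flag, callers may want to normalize it. The following is a minimal sketch; split_search_result is a hypothetical helper and the sample values stand in for real responses.

```python
def split_search_result(result, suggestion):
    """Normalize the return value of search() to (titles, suggested_title)."""
    if suggestion:
        # With suggestion=True the documented shape is (list results, suggestion).
        titles, suggested = result
        return list(titles), suggested
    # Otherwise only the list of results is returned.
    return list(result), None


# Made-up sample values, not real API output.
with_suggestion = split_search_result((["Washington", "Washington, D.C."], "washington"), True)
without_suggestion = split_search_result(["Washington"], False)
```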

allpages(query='', results=10)

Request all pages from mediawiki instance

Parameters:
  • query (str) – Search string to use for pulling pages

  • results (int) – The number of pages to return

Returns:

The pages that meet the search query

Return type:

list

Note

Could add ability to continue past the limit of 500

summary(title, sentences=0, chars=0, auto_suggest=True, redirect=True)

Get the summary for the title in question

Parameters:
  • title (str) – Page title to summarize

  • sentences (int) – Number of sentences to return in summary

  • chars (int) – Number of characters to return in summary

  • auto_suggest (bool) – Run auto-suggest on title before summarizing

  • redirect (bool) – Use page redirect on title before summarizing

Returns:

The summarized results of the page

Return type:

str

Note

Precedence for parameters: sentences then chars; if both are 0 then the entire first section is returned

Note

If the page doesn’t match the provided title, try setting auto_suggest to False
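The precedence rule in the first note can be written out as a small decision function. This is only a sketch of the documented rule, not the library's code; the name summary_mode and the returned labels are invented for illustration.

```python
def summary_mode(sentences=0, chars=0):
    """Return which limit applies under the documented precedence."""
    if sentences:
        # sentences wins whenever it is non-zero, even if chars is also set.
        return "sentences"
    if chars:
        return "chars"
    # Both are 0: the entire first section is returned.
    return "first_section"
```

For example, summary_mode(sentences=2, chars=100) selects the sentence limit, so the chars value is ignored.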

geosearch(latitude=None, longitude=None, radius=1000, title=None, auto_suggest=True, results=10)

Search for pages that relate to the provided geocoords or near the page

Parameters:
  • latitude (Decimal or None) – Latitude geocoord; must be coercible to decimal

  • longitude (Decimal or None) – Longitude geocoord; must be coercible to decimal

  • radius (int) – Radius around page or geocoords to pull back; in meters

  • title (str) – Page title to use as a geocoordinate; this has precedence over lat/long

  • auto_suggest (bool) – Auto-suggest the page title

  • results (int) – Number of pages within the radius to return

Returns:

A listing of page titles

Return type:

list

Note

The Geosearch API does not support pulling more than the maximum of 500

Note

If the page doesn’t match the provided title, try setting auto_suggest to False

Raises:

ValueError – If either the passed latitude or longitude are not coercible to a Decimal
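To see what "coercible to a Decimal" can mean in practice, the sketch below validates a coordinate before it would be passed to geosearch(). How the library performs the coercion internally is not stated here, so treat to_decimal as a hypothetical helper that only mirrors the documented ValueError behaviour.

```python
from decimal import Decimal, InvalidOperation


def to_decimal(value, name):
    """Coerce a coordinate to Decimal, raising ValueError if that fails."""
    try:
        # Going through str() avoids binary floating-point artefacts.
        return Decimal(str(value))
    except InvalidOperation as exc:
        raise ValueError(f"{name} is not coercible to a Decimal: {value!r}") from exc


latitude = to_decimal("48.8584", "latitude")
longitude = to_decimal(2.2945, "longitude")
```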

prefixsearch(prefix, results=10)

Perform a prefix search using the provided prefix string

Parameters:
  • prefix (str) – Prefix string to use for search

  • results (int) – Number of pages with the prefix to return

Returns:

List of page titles

Return type:

list

Note

Per the documentation: “The purpose of this module is similar to action=opensearch: to take user input and provide the best-matching titles. Depending on the search engine backend, this might include typo correction, redirect avoidance, or other heuristics.”

Note

Could add ability to continue past the limit of 500

opensearch(query, results=10, redirect=True)

Execute a MediaWiki opensearch request, similar to search box suggestions and conforming to the OpenSearch specification

Parameters:
  • query (str) – Title to search for

  • results (int) – Number of pages to return

  • redirect (bool) – If False return the redirect itself, otherwise resolve redirects

Returns:

List of results that are stored in a tuple (Title, Summary, URL)

Return type:

list

Note

The Opensearch API does not support pulling more than the maximum of 500
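Since each opensearch() result is a (Title, Summary, URL) tuple, it can be unpacked directly. The sketch below builds a title-to-URL mapping; titles_to_urls is a hypothetical helper and the sample entry is made up rather than taken from a real response.

```python
def titles_to_urls(results):
    """Map each result title to its URL, dropping the summary."""
    return {title: url for title, _summary, url in results}


# Made-up sample in the documented (Title, Summary, URL) shape.
sample_results = [
    ("New York", "Placeholder summary text", "https://en.wikipedia.org/wiki/New_York"),
]
url_by_title = titles_to_urls(sample_results)
```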

categorymembers(category, results=10, subcategories=True)

Get information about a category: pages and subcategories

Parameters:
  • category (str) – Category name

  • results (int) – Number of results to return

  • subcategories (bool) – Include subcategories (True) or not (False)

Returns:

Either a tuple ([pages], [subcategories]) or just the list of pages

Return type:

tuple or list

Note

Set results to None to get all results

categorytree(category: str, depth: int = 5) -> Dict[str, Any]

Generate the Category Tree for the given categories

Parameters:
  • category (str or list of strings) – Category name(s)

  • depth (int) – Depth to traverse the tree

Returns:

Category tree structure

Return type:

dict

Note

Set depth to None to get the whole tree

Note

Return Data Structure: Subcategory contains the same recursive structure

>>> {
        'category': {
            'depth': Number,
            'links': list,
            'parent-categories': list,
            'sub-categories': dict
        }
    }

New in version 0.3.10.
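Because sub-categories repeat the same structure, the tree is naturally walked with recursion. The sketch below lists every category with its nesting level; iter_categories is a hypothetical helper and sample_tree is hand-written to match the structure shown above.

```python
def iter_categories(tree, level=0):
    """Yield (level, category_name) for every node in a category tree."""
    for name, node in tree.items():
        yield level, name
        # 'sub-categories' holds the same recursive structure (may be empty).
        yield from iter_categories(node.get("sub-categories") or {}, level + 1)


# Hand-written sample matching the documented return structure.
sample_tree = {
    "Chess": {
        "depth": 0,
        "links": ["Chess"],
        "parent-categories": ["Board games"],
        "sub-categories": {
            "Chess openings": {
                "depth": 1,
                "links": [],
                "parent-categories": ["Chess"],
                "sub-categories": {},
            }
        },
    }
}
flat = list(iter_categories(sample_tree))
```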

clear_memoized()

Clear memoized (cached) values

page(title=None, pageid=None, auto_suggest=True, redirect=True, preload=False)

Get MediaWiki page based on the provided title or pageid

Parameters:
  • title (str) – Page title

  • pageid (int) – MediaWiki page identifier

  • auto_suggest (bool) – True: Allow page title auto-suggest

  • redirect (bool) – True: Follow page redirects

  • preload (bool) – True: Load most page properties

Note

Title takes precedence over pageid if both are provided

Note

If the page doesn’t match the provided title, try setting auto_suggest to False

random(pages: int = 1) -> str | List[str]

Request a random page title or list of random titles

Parameters:

pages (int) – Number of random pages to return

Returns:

A list of random page titles or a random page title if pages = 1

Return type:

str or list

set_api_url(api_url: str = 'https://{lang}.wikipedia.org/w/api.php', lang: str = 'en', username: str | None = None, password: str | None = None)

Set the API URL and language

Parameters:
  • api_url (str) – API URL to use

  • lang (str) – Language of the API URL

  • username (str) – The username, if needed, to log into the MediaWiki site

  • password (str) – The password, if needed, to log into the MediaWiki site

Raises:

mediawiki.exceptions.MediaWikiAPIURLError – if the url is not a valid MediaWiki site or login fails
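The default api_url contains a {lang} placeholder, which suggests the language is substituted into the URL. The sketch below shows one way such a substitution could work with str.format; this is an assumption about the mechanism, and resolve_api_url is a hypothetical helper, not part of the library.

```python
API_URL_TEMPLATE = "https://{lang}.wikipedia.org/w/api.php"


def resolve_api_url(template, lang):
    """Fill the {lang} placeholder; URLs without one are returned unchanged."""
    return template.format(lang=lang)


french_url = resolve_api_url(API_URL_TEMPLATE, "fr")
# A URL that does not encode a language is unaffected by the lang argument.
fixed_url = resolve_api_url("https://wiki.example.org/api.php", "fr")
```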

wiki_request(params: Dict[str, Any]) -> Dict[Any, Any]

Make a request to the MediaWiki API using the given search parameters

Parameters:

params (dict) – Request parameters

Returns:

A parsed dict of the JSON response

Note

Useful when wanting to query the MediaWiki site for some value that is not part of the wrapper API
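As an example of a query outside the wrapper API, the sketch below asks for site statistics through wiki_request(). The parameter names follow the MediaWiki action API (action=query, meta=siteinfo, siprop=statistics); site_statistics and FakeWiki are hypothetical, and the fake exists only so the sketch runs without network access.

```python
def site_statistics(wiki):
    """Fetch site statistics via a raw API request (hypothetical helper)."""
    params = {"action": "query", "meta": "siteinfo", "siprop": "statistics"}
    response = wiki.wiki_request(params)
    # Fall back to an empty dict if the expected keys are missing.
    return response.get("query", {}).get("statistics", {})


class FakeWiki:
    """Stand-in for a MediaWiki instance; returns a canned response."""

    def wiki_request(self, params):
        self.last_params = params
        return {"query": {"statistics": {"articles": 42}}}


fake = FakeWiki()
stats = site_statistics(fake)
```

With a real MediaWiki instance in place of FakeWiki, the same function would return whatever statistics the site reports.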

property api_url: str

API URL of the MediaWiki site

Note

Not settable; See mediawiki.MediaWiki.set_api_url()

Type:

str

property api_version: str | None

API Version of the MediaWiki site

Note

Not settable

Type:

str

property extensions: List[str]

Extensions installed on the MediaWiki site

Note

Not settable

Type:

list

property language: str

The API URL language; if possible, setting this will update the API URL

Note

Use correct language titles with the updated API URL

Note

Some API URLs do not encode language; unable to update if this is the case

Type:

str

property memoized: Dict[Any, Any]

Return the memoize cache

Note

Not settable; see mediawiki.MediaWiki.clear_memoized()

Type:

dict

property rate_limit: bool

Turn on or off Rate Limiting

Type:

bool

property rate_limit_min_wait: timedelta

Time to wait between calls

Note

Only used if rate_limit is True

Type:

timedelta
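To make the idea of a minimum wait concrete, the sketch below computes how long a caller would still need to pause before the next request. It illustrates the general technique only and is not the library's implementation; remaining_wait is a hypothetical helper.

```python
from datetime import datetime, timedelta


def remaining_wait(last_call, now, min_wait):
    """Return how long to pause before the next request (zero if none)."""
    elapsed = now - last_call
    # Never return a negative duration once the minimum wait has passed.
    return max(min_wait - elapsed, timedelta(0))


min_wait = timedelta(milliseconds=50)
start = datetime(2024, 1, 1, 12, 0, 0)
short_gap = remaining_wait(start, start + timedelta(milliseconds=20), min_wait)
long_gap = remaining_wait(start, start + timedelta(milliseconds=80), min_wait)
```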

property refresh_interval: int | None

The interval at which the memoize cache is to be refreshed

Type:

int

property supported_languages: Dict[str, str]

All supported language prefixes on the MediaWiki site

Note

Not settable

Type:

dict

property timeout: float | None

Response timeout for API requests

Note

Use None for no response timeout

Type:

float

property user_agent: str

User agent string

Note

If used as part of another project, this should be changed

Type:

str

property version: str

The version of the pymediawiki library

Note

Not settable

Type:

str

MediaWikiPage

class mediawiki.MediaWikiPage(mediawiki, title: str | None = None, pageid: int | None = None, redirect: bool = True, preload: bool = False, original_title: str = '')

MediaWiki Page Instance

Parameters:
  • mediawiki (MediaWiki) – MediaWiki class object from which to pull

  • title (str) – Title of page to retrieve

  • pageid (int) – MediaWiki site pageid to retrieve

  • redirect (bool) – True: Follow redirects

  • preload (bool) – True: Load most properties after getting page

  • original_title (str) – Not to be used from the caller; used to help follow redirects

Warning

This should never need to be used directly! Please use mediawiki.MediaWiki.page()

Parse all links within a section

Parameters:

section_title (str) – Name of the section to pull or, if None is provided, the links between the main heading and the first section

Returns:

List of (title, url) tuples

Return type:

list

Note

Use None to pull the links from the header section

Note

Returns None if section title is not found

Note

Side effect is to also pull the html which can be slow

Note

This is a parsing operation and not part of the standard API

section(section_title: str | None) -> str | None

Plain text section content

Parameters:

section_title (str) – Name of the section to pull or None for the header section

Returns:

The content of the section

Return type:

str

Note

Use None if the header section is desired

Note

Returns None if section title is not found; only text between title and next section or sub-section title is returned

Note

Side effect is to also pull the content which can be slow

Note

This is a parsing operation and not part of the standard API

summarize(sentences: int = 0, chars: int = 0) -> str

Summarize page either by number of sentences, chars, or first section (default)

Parameters:
  • sentences (int) – Number of sentences to use in summary (first x sentences)

  • chars (int) – Number of characters to use in summary (first x characters)

Returns:

The summary of the MediaWiki page

Return type:

str

Note

Precedence for parameters: sentences then chars; if both are 0 then the entire first section is returned

Pages that link to this page

Note

Not settable

Type:

list

property categories: List[str]

Non-hidden categories on the page

Note

Not settable

Type:

list

property content: str

The page content in text format

Note

Not settable

Note

Side effect is to also get revision_id and parent_id

Type:

str

property coordinates: Tuple[Decimal, Decimal] | None

GeoCoordinates of the place referenced; results in lat/long tuple or None if no geocoordinates present

Note

Not settable

Note

Requires the GeoData extension to be installed

Type:

Tuple

property hatnotes: List[str]

Parse hatnotes from the HTML

Note

Not settable

Note

Side effect is to also pull the html which can be slow

Note

This is a parsing operation and not part of the standard API

Type:

list

property html: str

HTML representation of the page

Note

Not settable

Warning

This can be slow for very large pages

Type:

str

property images: List[str]

Images on the page

Note

Not settable

Type:

list

Names of the page in other languages, as a dictionary in which the key is the language code and the value is the name of the page in that language

Note

Not settable

Note

List of all language links from the provided pages to other languages, according to https://www.mediawiki.org/wiki/API:Langlinks

Type:

dict

List of all MediaWiki page links on the page

Note

Not settable

Type:

list

property logos: List[str]

Parse images within the infobox signifying either the main image or logo

Note

Not settable

Note

Side effect is to also pull the html which can be slow

Note

This is a parsing operation and not part of the standard API

Type:

list

property parent_id: int

The parent id of the page

Note

Not settable

Note

Side effect is to also get content and revision_id

Type:

int

property preview: Dict[str, str]

Page preview information that builds the preview hover

Type:

dict

property redirects: List[str]

List of all redirects to this page; i.e., the titles listed here will redirect to this page title

Note

Not settable

Type:

list

property references: List[str]

External links, or references, listed anywhere on the MediaWiki page

Note

Not settable

Note

May include external links within page that are not technically cited anywhere

Type:

list

property revision_id: int

The current revision id of the page

Note

Not settable

Note

Side effect is to also get content and parent_id

Type:

int

property sections: List[str]

Table of contents sections

Note

Not settable

Type:

list

property summary: str | None

Default page summary

Note

Not settable

Type:

str

property table_of_contents: Dict[str, Any]

Dictionary of sections and sub-sections

Note

Leaf nodes are empty OrderedDict objects

Note

Not settable

Type:

OrderedDict
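Since leaf nodes are empty OrderedDict objects, the table of contents can be flattened with a short recursive walk. The sketch below is one way to do that; flatten_toc is a hypothetical helper and sample_toc is hand-written to match the described shape.

```python
from collections import OrderedDict


def flatten_toc(toc, depth=0):
    """Return (depth, section_title) rows in document order."""
    rows = []
    for title, children in toc.items():
        rows.append((depth, title))
        # Leaf nodes are empty mappings, so the recursion stops naturally.
        rows.extend(flatten_toc(children, depth + 1))
    return rows


# Hand-written sample: one section with a sub-section, then a leaf section.
sample_toc = OrderedDict([
    ("History", OrderedDict([("Early years", OrderedDict())])),
    ("References", OrderedDict()),
])
toc_rows = flatten_toc(sample_toc)
```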

property wikitext: str

Wikitext representation of the page

Note

Not settable

Type:

str

Exceptions

MediaWiki Exceptions

exception mediawiki.exceptions.DisambiguationError(title: str, may_refer_to: List[str], url: str, details: List[Dict] | None = None)

Exception raised when a page resolves to a Disambiguation page

Parameters:
  • title (str) – Title that resulted in a disambiguation page

  • may_refer_to (list) – List of possible titles

  • url (str) – Full URL to the disambiguation page

  • details (list) – A list of dictionaries with more information about the possible results

Note

options only includes titles that link to valid MediaWiki pages

property details: List[Dict] | None

The details of the proposed non-disambiguation pages

Type:

list

property options: List[str]

The list of possible page titles

Type:

list

property title: str

The title of the page

Type:

str

property unordered_options: List[str]

The list of possible page titles, unsorted in an attempt to keep them in the order they show up on the page

Type:

list

property url: str

The url, if possible, of the disambiguation page

Type:

str
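A common way to handle this exception is to fall back to one of the listed options. The sketch below retries with the first entry of options; page_or_first_option is a hypothetical helper, and the stand-in class is defined only so the sketch runs when the library is not installed. Treating options as sorted in the stand-in is an assumption inferred from the description of unordered_options.

```python
try:
    from mediawiki.exceptions import DisambiguationError
except ImportError:
    class DisambiguationError(Exception):
        """Stand-in mirroring the documented constructor and options."""

        def __init__(self, title, may_refer_to, url, details=None):
            super().__init__(title)
            self.title = title
            self.options = sorted(may_refer_to)  # assumption: sorted titles
            self.url = url
            self.details = details


def page_or_first_option(wiki, title):
    """Return the page for title, retrying once with the first option."""
    try:
        return wiki.page(title)
    except DisambiguationError as err:
        if not err.options:
            raise
        return wiki.page(err.options[0])
```

Picking the first option is rarely the right choice for end users; an interactive tool would more likely present err.options and let the user decide.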

exception mediawiki.exceptions.HTTPTimeoutError(query: str)

Exception raised when a request to the MediaWiki site times out.

Parameters:

query (str) – The query that timed out

property query: str

The query that timed out

Type:

str

exception mediawiki.exceptions.MediaWikiAPIURLError(api_url: str)

Exception raised when the MediaWiki server does not support the API

Parameters:

api_url (str) – The API URL that was not recognized

property api_url: str

The api url that raised the exception

Type:

str

exception mediawiki.exceptions.MediaWikiBaseException(message: str)

Base MediaWikiException

Parameters:

message (str) – The message of the exception

property message: str

The MediaWiki exception message

Type:

str

exception mediawiki.exceptions.MediaWikiCategoryTreeError(category: str)

Exception raised when the category tree is unable to complete for an unknown reason

Parameters:

category (str) – The category that threw an exception

property category: str

The category that threw an exception during category tree generation

Type:

str

exception mediawiki.exceptions.MediaWikiException(error: str)

MediaWiki Exception Class

Parameters:

error (str) – The error message that the MediaWiki site returned

property error: str

The error message that the MediaWiki site returned

Type:

str

exception mediawiki.exceptions.MediaWikiGeoCoordError(error: str)

Exception raised for GeoData errors

Parameters:

error (str) – Error message from the MediaWiki site related to GeoCoordinates

property error: str

The error that was thrown when pulling GeoCoordinates

Type:

str

exception mediawiki.exceptions.MediaWikiLoginError(error: str)

Exception raised when unable to login to the MediaWiki site

Parameters:

error (str) – The error message that the MediaWiki site returned

property error: str

The error message that the MediaWiki site returned

Type:

str

exception mediawiki.exceptions.PageError(title: str | None = None, pageid: int | None = None)

Exception raised when no MediaWiki page matched a query

Parameters:
  • title (str) – Title of the page

  • pageid (int) – MediaWiki page id of the page

property pageid: int

The page id that caused the page error

Type:

int

property title: str

The title that caused the page error

Type:

str

exception mediawiki.exceptions.RedirectError(title: str)

Exception raised when a page title unexpectedly resolves to a redirect

Parameters:

title (str) – Title of the page that redirected

Note

This should only occur if both auto_suggest and redirect are set to False

property title: str

The title that was redirected

Type:

str
