MediaWiki Documentation
Here you can find the full developer API for the mediawiki project.
Functions and Classes
MediaWiki
- class mediawiki.MediaWiki(url: str = 'https://{lang}.wikipedia.org/w/api.php', lang: str = 'en', timeout: float = 15.0, rate_limit: bool = False, rate_limit_wait: timedelta = datetime.timedelta(microseconds=50000), cat_prefix: str = 'Category', user_agent: str | None = None, username: str | None = None, password: str | None = None, proxies: Dict | None = None, verify_ssl: bool | str = True)[source]
MediaWiki API Wrapper Instance
- Parameters:
url (str) – API URL of the MediaWiki site; defaults to Wikipedia
lang (str) – Language of the MediaWiki site; used to help change API URL
timeout (float) – HTTP timeout setting; None means no timeout
rate_limit (bool) – Use rate limiting to limit calls to the site
rate_limit_wait (timedelta) – Amount of time to wait between requests
cat_prefix (str) – The prefix for categories used by the mediawiki site; defaults to Category (en)
user_agent (str) – The user agent string to use when making requests; defaults to a string containing the library version, but the MediaWiki API documentation recommends setting a unique user agent rather than using the library’s default
username (str) – The username to use to log into the MediaWiki
password (str) – The password to use to log into the MediaWiki
proxies (dict) – A dictionary of specific proxies to use in the Requests library
verify_ssl (bool|str) – Verify SSL Certificates to be passed directly into the Requests library
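A minimal sketch of configuring a client with the parameters listed above. The helper names `client_kwargs` and `make_client`, the project string, and the contact address are hypothetical; only the keyword names come from the constructor signature. The library import is deferred so that the settings can be inspected without pymediawiki installed.

```python
from datetime import timedelta


def client_kwargs(project, contact, wait_ms=100):
    """Collect constructor keyword arguments (hypothetical helper)."""
    return {
        "lang": "en",
        "timeout": 10.0,
        "rate_limit": True,
        # Wait at least wait_ms milliseconds between requests.
        "rate_limit_wait": timedelta(milliseconds=wait_ms),
        # A unique user agent, as the user_agent parameter recommends.
        "user_agent": f"{project} ({contact})",
    }


def make_client(**kwargs):
    """Create the client; needs pymediawiki installed and network access."""
    from mediawiki import MediaWiki

    return MediaWiki(**kwargs)
```

Calling `make_client(**client_kwargs("my-tool/0.1", "ops@example.org"))` would then build a rate-limited English Wikipedia client, assuming the site is reachable.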
- login(username, password)[source]
Login as specified user
- Parameters:
username (str) – The username to log in with
password (str) – The password for the user
strict (bool) – True to throw an error on failure
- Returns:
True if successfully logged in; False otherwise
- Return type:
bool
- Raises:
mediawiki.exceptions.MediaWikiLoginError – if unable to login
Note
Per the MediaWiki API, one should use the bot password; see https://www.mediawiki.org/wiki/API:Login for more information
- suggest(query)[source]
Get a suggested page title for the provided query, or None if no suggestion is found
- Parameters:
query (str) – Page title
- Returns:
Suggested page title or None if no suggestion found
- Return type:
str or None
- search(query, results=10, suggestion=False)[source]
Search for similar titles
- Parameters:
query (str) – Page title
results (int) – Number of pages to return
suggestion (bool) – Use suggestion
- Returns:
tuple (list results, suggestion) if suggestion is True; list of results otherwise
- Return type:
tuple or list
Note
Could add ability to continue past the limit of 500
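Because the return shape of search depends on the suggestion flag, callers may want to pick one shape and stick to it. The sketch below assumes `wiki` is an already constructed MediaWiki instance; the wrapper names are hypothetical.

```python
def search_with_suggestion(wiki, query, results=10):
    """Return (titles, suggestion) using the documented tuple form."""
    titles, suggestion = wiki.search(query, results=results, suggestion=True)
    return list(titles), suggestion


def search_titles(wiki, query, results=10):
    """Return only the list of titles (suggestion defaults to False)."""
    return list(wiki.search(query, results=results))
```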
- allpages(query='', results=10)[source]
Request all pages from mediawiki instance
- Parameters:
query (str) – Search string to use for pulling pages
results (int) – The number of pages to return
- Returns:
The pages that meet the search query
- Return type:
list
Note
Could add ability to continue past the limit of 500
- summary(title, sentences=0, chars=0, auto_suggest=True, redirect=True)[source]
Get the summary for the title in question
- Parameters:
title (str) – Page title to summarize
sentences (int) – Number of sentences to return in summary
chars (int) – Number of characters to return in summary
auto_suggest (bool) – Run auto-suggest on title before summarizing
redirect (bool) – Use page redirect on title before summarizing
- Returns:
The summarized results of the page
- Return type:
str
Note
Precedence for parameters: sentences then chars; if both are 0 then the entire first section is returned
Note
If the page doesn’t match the provided title, try setting auto_suggest to False
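The precedence rule in the note above can be written out as a small function. This is an illustration of the documented rule, not the library's code; `summary_mode` and `brief` are hypothetical names, and `wiki` is assumed to be a MediaWiki instance.

```python
def summary_mode(sentences=0, chars=0):
    """Name the limit that applies under the documented precedence."""
    if sentences:
        return "sentences"      # sentences wins whenever it is non-zero
    if chars:
        return "chars"          # chars is used only when sentences is 0
    return "first section"      # both 0: the entire first section


def brief(wiki, title, sentences=2):
    """Fetch a short summary for the exact title given (no auto-suggest)."""
    return wiki.summary(title, sentences=sentences, auto_suggest=False)
```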
- geosearch(latitude=None, longitude=None, radius=1000, title=None, auto_suggest=True, results=10)[source]
Search for pages that relate to the provided geocoords or near the page
- Parameters:
latitude (Decimal or None) – Latitude geocoord; must be coercible to decimal
longitude (Decimal or None) – Longitude geocoord; must be coercible to decimal
radius (int) – Radius around page or geocoords to pull back; in meters
title (str) – Page title to use as a geocoordinate; this has precedence over lat/long
auto_suggest (bool) – Auto-suggest the page title
results (int) – Number of pages within the radius to return
- Returns:
A listing of page titles
- Return type:
list
Note
The Geosearch API does not support pulling more than the maximum of 500
Note
If the page doesn’t match the provided title, try setting auto_suggest to False
- Raises:
ValueError – If either the passed latitude or longitude are not coercible to a Decimal
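One way to fail early on bad coordinates is to coerce them before calling geosearch. The sketch below is an assumption about how a caller might do this, not the library's own validation; `to_decimal` and `nearby_titles` are hypothetical helpers and `wiki` is assumed to be a MediaWiki instance.

```python
from decimal import Decimal, InvalidOperation


def to_decimal(value, name):
    """Coerce value to Decimal or raise ValueError naming the field."""
    try:
        # Going through str() avoids binary float noise such as 0.1 + 0.2.
        return Decimal(str(value))
    except InvalidOperation:
        raise ValueError(f"{name} is not coercible to Decimal: {value!r}") from None


def nearby_titles(wiki, latitude, longitude, radius=1000, results=10):
    """Return page titles within radius meters of the coordinates."""
    lat = to_decimal(latitude, "latitude")
    lon = to_decimal(longitude, "longitude")
    return wiki.geosearch(latitude=lat, longitude=lon, radius=radius, results=results)
```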
- prefixsearch(prefix, results=10)[source]
Perform a prefix search using the provided prefix string
- Parameters:
prefix (str) – Prefix string to use for search
results (int) – Number of pages with the prefix to return
- Returns:
List of page titles
- Return type:
list
Note
Per the documentation: “The purpose of this module is similar to action=opensearch: to take user input and provide the best-matching titles. Depending on the search engine backend, this might include typo correction, redirect avoidance, or other heuristics.”
Note
Could add ability to continue past the limit of 500
- opensearch(query, results=10, redirect=True)[source]
Execute a MediaWiki opensearch request, similar to search box suggestions and conforming to the OpenSearch specification
- Parameters:
query (str) – Title to search for
results (int) – Number of pages to return
redirect (bool) – If False return the redirect itself, otherwise resolve redirects
- Returns:
List of results that are stored in a tuple (Title, Summary, URL)
- Return type:
list
Note
The Opensearch API does not support pulling more than the maximum of 500
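Since opensearch returns (Title, Summary, URL) tuples, naming the fields can make calling code easier to read. A minimal sketch, assuming `wiki` is a MediaWiki instance; `Hit` and `opensearch_hits` are hypothetical names.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One opensearch result with named fields."""
    title: str
    summary: str
    url: str


def opensearch_hits(wiki, query, results=10):
    """Wrap each (Title, Summary, URL) tuple in a Hit."""
    return [Hit(*row) for row in wiki.opensearch(query, results=results)]
```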
- categorymembers(category, results=10, subcategories=True)[source]
Get information about a category: pages and subcategories
- Parameters:
category (str) – Category name
results (int) – Number of results to return
subcategories (bool) – Include subcategories (True) or not (False)
- Returns:
Either a tuple ([pages], [subcategories]) or just the list of pages
- Return type:
tuple or list
Note
Set results to None to get all results
- categorytree(category: str, depth: int = 5) Dict[str, Any] [source]
Generate the Category Tree for the given categories
- Parameters:
category (str or list of strings) – Category name(s)
depth (int) – Depth to traverse the tree
- Returns:
Category tree structure
- Return type:
dict
Note
Set depth to None to get the whole tree
Note
Return Data Structure: Subcategory contains the same recursive structure
>>> { 'category': { 'depth': Number, 'links': list, 'parent-categories': list, 'sub-categories': dict } }
New in version 0.3.10.
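The recursive structure described above can be traversed with a short generator. This sketch relies only on the keys shown in the note ('depth' and 'sub-categories'); `walk_categories` is a hypothetical helper, and it tolerates a missing or empty 'sub-categories' entry as a precaution.

```python
def walk_categories(tree):
    """Yield (name, depth) for every category in the nested structure."""
    for name, node in tree.items():
        yield name, node["depth"]
        # Each sub-category entry has the same shape as its parent.
        yield from walk_categories(node.get("sub-categories") or {})
```

For example, `sorted(name for name, _ in walk_categories(tree))` would give a flat, alphabetical list of every category reached.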
- page(title=None, pageid=None, auto_suggest=True, redirect=True, preload=False)[source]
Get MediaWiki page based on the provided title or pageid
- Parameters:
title (str) – Page title
pageid (int) – MediaWiki page identifier
auto_suggest (bool) – True: Allow page title auto-suggest
redirect (bool) – True: Follow page redirects
preload (bool) – True: Load most page properties
- Raises:
ValueError – when title is blank or None and no pageid is provided
mediawiki.exceptions.PageError – if page does not exist
Note
Title takes precedence over pageid if both are provided
Note
If the page doesn’t match the provided title, try setting auto_suggest to False
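A sketch of handling the two most common failures when requesting a page. `resolve_page` is a hypothetical helper and `wiki` is assumed to be a MediaWiki instance. The fallback classes in the `except ImportError` branch are stand-ins written for this example so it runs without pymediawiki installed; they are not the library's implementations.

```python
try:
    from mediawiki.exceptions import DisambiguationError, PageError
except ImportError:
    # Stand-ins (assumptions, not the library's classes).
    class PageError(Exception):
        def __init__(self, title=None, pageid=None):
            super().__init__(f"no page for title={title!r} pageid={pageid!r}")
            self.title, self.pageid = title, pageid

    class DisambiguationError(Exception):
        def __init__(self, title, may_refer_to, url, details=None):
            super().__init__(f"{title!r} is ambiguous")
            self.title, self.url, self.details = title, url, details
            self.options = sorted(may_refer_to)


def resolve_page(wiki, title):
    """Return ("ok", page), ("ambiguous", options) or ("missing", None)."""
    try:
        # auto_suggest=False asks for exactly the title given.
        return "ok", wiki.page(title, auto_suggest=False)
    except DisambiguationError as err:
        return "ambiguous", err.options
    except PageError:
        return "missing", None
```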
- random(pages: int = 1) str | List[str] [source]
Request a random page title or list of random titles
- Parameters:
pages (int) – Number of random pages to return
- Returns:
A random page title if pages = 1; otherwise a list of random page titles
- Return type:
str or list
- set_api_url(api_url: str = 'https://{lang}.wikipedia.org/w/api.php', lang: str = 'en', username: str | None = None, password: str | None = None)[source]
Set the API URL and language
- Parameters:
api_url (str) – API URL to use
lang (str) – Language of the API URL
username (str) – The username, if needed, to log into the MediaWiki site
password (str) – The password, if needed, to log into the MediaWiki site
- Raises:
mediawiki.exceptions.MediaWikiAPIURLError – if the url is not a valid MediaWiki site or login fails
- wiki_request(params: Dict[str, Any]) Dict[Any, Any] [source]
Make a request to the MediaWiki API using the given search parameters
- Parameters:
params (dict) – Request parameters
- Returns:
A parsed dict of the JSON response
Note
Useful when wanting to query the MediaWiki site for some value that is not part of the wrapper API
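As a sketch of a raw request, the helper below builds parameters for the Action API's `action=query` with `prop=info`, which returns basic page information. Those parameter names come from the MediaWiki Action API rather than from this wrapper; `page_info_params` and `page_info` are hypothetical names and `wiki` is assumed to be a MediaWiki instance.

```python
def page_info_params(titles):
    """Build query parameters for basic page info on one or more titles."""
    return {
        "action": "query",
        "prop": "info",
        # The Action API separates multiple values with a pipe.
        "titles": "|".join(titles),
    }


def page_info(wiki, titles):
    """Send the request and return the parsed JSON response as a dict."""
    return wiki.wiki_request(page_info_params(titles))
```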
- property api_url: str
API URL of the MediaWiki site
Note
Not settable; see mediawiki.MediaWiki.set_api_url()
- Type:
str
- property api_version: str | None
API Version of the MediaWiki site
Note
Not settable
- Type:
str
- property extensions: List[str]
Extensions installed on the MediaWiki site
Note
Not settable
- Type:
list
- property language: str
The API URL language; setting it updates the API URL when possible
Note
Use correct language titles with the updated API URL
Note
Some API URLs do not encode language; unable to update if this is the case
- Type:
str
- property memoized: Dict[Any, Any]
Return the memoize cache
Note
Not settable; see mediawiki.MediaWiki.clear_memoized()
- Type:
dict
- property rate_limit: bool
Turn on or off Rate Limiting
- Type:
bool
- property rate_limit_min_wait: timedelta
Time to wait between calls
Note
Only used if rate_limit is True
- Type:
timedelta
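To illustrate what a minimum wait between calls means, the sketch below computes how long a caller would still need to pause. It is a generic illustration under stated assumptions, not the library's implementation; `wait_needed` is a hypothetical helper, and the default mirrors the `rate_limit_wait` default in the constructor signature.

```python
from datetime import timedelta

# Default taken from the constructor signature (50 ms).
DEFAULT_WAIT = timedelta(microseconds=50000)


def wait_needed(last_call, now, min_wait=DEFAULT_WAIT):
    """Seconds still to wait so that calls are at least min_wait apart.

    last_call and now are timestamps in seconds (e.g. time.monotonic());
    last_call is None when no request has been made yet.
    """
    if last_call is None:
        return 0.0
    elapsed = now - last_call
    return max(0.0, min_wait.total_seconds() - elapsed)
```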
- property refresh_interval: int | None
The interval at which the memoize cache is to be refreshed
- Type:
int
- property supported_languages: Dict[str, str]
All supported language prefixes on the MediaWiki site
Note
Not settable
- Type:
dict
- property timeout: float | None
Response timeout for API requests
Note
Use None for no response timeout
- Type:
float
- property user_agent: str
User agent string
Note
If used as part of another project, this should be changed
- Type:
str
- property version: str
The version of the pymediawiki library
Note
Not settable
- Type:
str
MediaWikiPage
- class mediawiki.MediaWikiPage(mediawiki, title: str | None = None, pageid: int | None = None, redirect: bool = True, preload: bool = False, original_title: str = '')[source]
MediaWiki Page Instance
- Parameters:
mediawiki (MediaWiki) – MediaWiki class object from which to pull
title (str) – Title of page to retrieve
pageid (int) – MediaWiki site pageid to retrieve
redirect (bool) – True: Follow redirects
preload (bool) – True: Load most properties after getting page
original_title (str) – Not to be used from the caller; used to help follow redirects
- Raises:
mediawiki.exceptions.PageError – if page provided does not exist
mediawiki.exceptions.DisambiguationError – if page provided is a disambiguation page
mediawiki.exceptions.RedirectError – if redirect is False and the pageid or title provided redirects to another page
Warning
This should never need to be used directly! Please use mediawiki.MediaWiki.page()
- parse_section_links(section_title: str) List[Tuple[str, str]] | None [source]
Parse all links within a section
- Parameters:
section_title (str) – Name of the section to pull or, if None is provided, the links between the main heading and the first section
- Returns:
List of (title, url) tuples
- Return type:
list
Note
Use None to pull the links from the header section
Note
Returns None if section title is not found
Note
Side effect is to also pull the html which can be slow
Note
This is a parsing operation and not part of the standard API
- section(section_title: str | None) str | None [source]
Plain text section content
- Parameters:
section_title (str) – Name of the section to pull or None for the header section
- Returns:
The content of the section
- Return type:
str
Note
Use None if the header section is desired
Note
Returns None if section title is not found; only text between title and next section or sub-section title is returned
Note
Side effect is to also pull the content which can be slow
Note
This is a parsing operation and not part of the standard API
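A sketch that combines the sections property with section() to collect the text of each section, skipping any title for which section() returns None. `section_texts` and `lead_text` are hypothetical helpers and `page` is assumed to be a MediaWikiPage.

```python
def section_texts(page):
    """Map each section title to its plain text."""
    texts = {}
    for title in page.sections:
        text = page.section(title)
        if text is not None:  # None means the title was not found
            texts[title] = text
    return texts


def lead_text(page):
    """Text of the header section (before the first section title)."""
    return page.section(None)
```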
- summarize(sentences: int = 0, chars: int = 0) str [source]
Summarize page either by number of sentences, chars, or first section (default)
- Parameters:
sentences (int) – Number of sentences to use in summary (first x sentences)
chars (int) – Number of characters to use in summary (first x characters)
- Returns:
The summary of the MediaWiki page
- Return type:
str
Note
Precedence for parameters: sentences then chars; if both are 0 then the entire first section is returned
- property backlinks: List[str]
Pages that link to this page
Note
Not settable
- Type:
list
- property categories: List[str]
Non-hidden categories on the page
Note
Not settable
- Type:
list
- property content: str
The page content in text format
Note
Not settable
Note
Side effect is to also get revision_id and parent_id
- Type:
str
- property coordinates: Tuple[Decimal, Decimal] | None
GeoCoordinates of the place referenced; results in lat/long tuple or None if no geocoordinates present
Note
Not settable
Note
Requires the GeoData extension to be installed
- Type:
Tuple
- property hatnotes: List[str]
Parse hatnotes from the HTML
Note
Not settable
Note
Side effect is to also pull the html which can be slow
Note
This is a parsing operation and not part of the standard API
- Type:
list
- property html: str
HTML representation of the page
Note
Not settable
Warning
This can be slow for very large pages
- Type:
str
- property images: List[str]
Images on the page
Note
Not settable
- Type:
list
- property langlinks: Dict[str, str]
Names of the page in other languages, as a dictionary in which the key is the language code and the value is the name of the page in that language.
Note
Not settable
Note
List of all language links from the provided page to other languages, according to https://www.mediawiki.org/wiki/API:Langlinks
- Type:
dict
- property links: List[str]
List of all MediaWiki page links on the page
Note
Not settable
- Type:
list
- property logos: List[str]
Parse images within the infobox signifying either the main image or logo
Note
Not settable
Note
Side effect is to also pull the html which can be slow
Note
This is a parsing operation and not part of the standard API
- Type:
list
- property parent_id: int
The parent id of the page
Note
Not settable
Note
Side effect is to also get content and revision_id
- Type:
int
- property preview: Dict[str, str]
Page preview information that builds the preview hover
- Type:
dict
- property redirects: List[str]
List of all redirects to this page; i.e., the titles listed here will redirect to this page title
Note
Not settable
- Type:
list
- property references: List[str]
External links, or references, listed anywhere on the MediaWiki page
Note
Not settable
Note
May include external links within page that are not technically cited anywhere
- Type:
list
- property revision_id: int
The current revision id of the page
Note
Not settable
Note
Side effect is to also get content and parent_id
- Type:
int
- property sections: List[str]
Table of contents sections
Note
Not settable
- Type:
list
- property summary: str | None
Default page summary
Note
Not settable
- Type:
str
- property table_of_contents: Dict[str, Any]
Dictionary of sections and sub-sections
Note
Leaf nodes are empty OrderedDict objects
Note
Not settable
- Type:
OrderedDict
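Because leaf nodes are empty mappings, the nested dictionary can be turned into an indented outline with simple recursion. A minimal sketch; `outline` is a hypothetical helper that expects the value of table_of_contents.

```python
def outline(toc, level=0):
    """Return indented lines, one per section or sub-section."""
    lines = []
    for title, children in toc.items():
        lines.append("  " * level + title)
        # An empty mapping marks a leaf, so the recursion stops there.
        lines.extend(outline(children, level + 1))
    return lines
```

Joining the result with newlines, as in `"\n".join(outline(page.table_of_contents))`, gives a printable outline.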
- property wikitext: str
Wikitext representation of the page
Note
Not settable
- Type:
str
Exceptions
MediaWiki Exceptions
- exception mediawiki.exceptions.DisambiguationError(title: str, may_refer_to: List[str], url: str, details: List[Dict] | None = None)[source]
Exception raised when a page resolves to a Disambiguation page
- Parameters:
title (str) – Title that resulted in a disambiguation page
may_refer_to (list) – List of possible titles
url (str) – Full URL to the disambiguation page
details (list) – A list of dictionaries with more information about the possible results
Note
options only includes titles that link to valid MediaWiki pages
- property details: List[Dict] | None
The details of the proposed non-disambiguation pages
- Type:
list
- property options: List[str]
The list of possible page titles
- Type:
list
- property title: str
The title of the page
- Type:
str
- property unordered_options: List[str]
The list of possible page titles, unsorted in an attempt to keep them in the order they show up on the page
- Type:
list
- property url: str
The url, if possible, of the disambiguation page
- Type:
str
- exception mediawiki.exceptions.HTTPTimeoutError(query: str)[source]
Exception raised when a request to the MediaWiki site times out.
- Parameters:
query (str) – The query that timed out
- property query: str
The query that timed out
- Type:
str
- exception mediawiki.exceptions.MediaWikiAPIURLError(api_url: str)[source]
Exception raised when the MediaWiki server does not support the API
- Parameters:
api_url (str) – The API URL that was not recognized
- property api_url: str
The API URL that raised the exception
- Type:
str
- exception mediawiki.exceptions.MediaWikiBaseException(message: str)[source]
Base MediaWikiException
- Parameters:
message (str) – The message of the exception
- property message: str
The MediaWiki exception message
- Type:
str
- exception mediawiki.exceptions.MediaWikiCategoryTreeError(category: str)[source]
Exception when the category tree is unable to complete for an unknown reason
- Parameters:
category (str) – The category that threw an exception
- property category: str
The category that threw an exception during category tree generation
- Type:
str
- exception mediawiki.exceptions.MediaWikiException(error: str)[source]
MediaWiki Exception Class
- Parameters:
error (str) – The error message that the MediaWiki site returned
- property error: str
The error message that the MediaWiki site returned
- Type:
str
- exception mediawiki.exceptions.MediaWikiGeoCoordError(error: str)[source]
Exceptions to handle GeoData exceptions
- Parameters:
error (str) – Error message from the MediaWiki site related to GeoCoordinates
- property error: str
The error that was thrown when pulling GeoCoordinates
- Type:
str
- exception mediawiki.exceptions.MediaWikiLoginError(error: str)[source]
Exception raised when unable to login to the MediaWiki site
- Parameters:
error (str) – The error message that the MediaWiki site returned
- property error: str
The error message that the MediaWiki site returned
- Type:
str
- exception mediawiki.exceptions.PageError(title: str | None = None, pageid: int | None = None)[source]
Exception raised when no MediaWiki page matched a query
- Parameters:
title (str) – Title of the page
pageid (int) – MediaWiki page id of the page
- property pageid: int
The page id that caused the page error
- Type:
int
- property title: str
The title that caused the page error
- Type:
str
- exception mediawiki.exceptions.RedirectError(title: str)[source]
Exception raised when a page title unexpectedly resolves to a redirect
- Parameters:
title (str) – Title of the page that redirected
Note
This should only occur if both auto_suggest and redirect are set to False
- property title: str
The title that was redirected
- Type:
str