MediaWiki internals
References
- Manual:MediaWiki hooks
- MediaWiki hooks: list of hooks in the current version, from source
- Layout architecture of MediaWiki pages
- Extension Hooks Registry
Parser
The parser converts a string $text from wikitext to HTML.
- "The MediaWiki codebase is large and ugly... One of the best ways to learn about MediaWiki is to read the code..."
-- How to become a MediaWiki hacker
Actually, that's about the only way to learn about it, since documentation is scattered and often out of date (as will be the comments below, all too soon). Some references:
- Manual: Parser.php
- Manual: Code - Parser.php
- Extensions FAQ - How do I render wikitext in my extension?
(According to the comment at the start of Parser.php, version 1.10)
| Entry points into the Parser class | |
|---|---|
| parse() | produces HTML output |
| preSaveTransform() | produces altered wiki markup |
| transformMsg() | performs brace substitution on MediaWiki messages |
| preprocess() | removes HTML comments and expands templates |
| Globals | |
| Objects used | $wgLang, $wgContLang |
| Not used | $wgArticle, $wgUser or $wgTitle. Keep them away! |
| Settings | |
| $wgUseTex*, $wgUseDynamicDates*, $wgInterwikiMagic*, | |
parser->parse()
The main entry point, function parse, performs the parsing in several stages:
- 1. strip() strips and renders nowiki, pre, math, and hiero
- Text between nowiki tags is replaced by a temporary code which at the end is replaced by the original text in an unstrip operation. After that XML-style tags of extensions are processed.
- preceded by the ParserBeforeStrip hook
- followed by the ParserAfterStrip hook
- 2. internalParse() converts the text from wikitext to HTML
- It expands variables and templates, replaces wiki markup such as header codes and double and triple quotation marks with their HTML equivalents, converts double-bracketed phrases into A HREF links to internal wiki pages, and converts single-bracketed URLs into A HREF links to external pages.
- preceded by the ParserBeforeInternalParse hook
- Function replaceVariables(): recursively expands variables, templates, and template parameters. This calls:
- Function replace_callback: parses the wikitext with respect to pairs of double and triple braces.
- Function braceSubstitution: expands variables and templates.
- If there are no pipes it calls function variableSubstitution.
- If the text between the braces is the name of a variable it calls function getVariableValue.
- If we have, after the opening braces, the title of a core parser function and then a colon, the function concerned in CoreParserFunctions.php is called.
- Function createAssocArgs: converts an array of parameter definitions such as "a=3", "5", "a=4", and "8" into an associative array with the parameter names a, 1, and 2 as indexes and the parameter values 4, 5, and 8 as array values.
- Function argSubstitution: expands template parameters.
- Function doHeadings: replaces header codes: ==a== becomes <h2>a</h2> etc.
- Function reformat in DateFormatter.php converts dates and times according to preferences
- Function doAllQuotes: replaces double and triple quotation marks by <i> and <b>.
- Function replaceInternalLinks: converts the internal links to HTML.
- Function replaceExternalLinks: converts the external links to HTML.
- 3. tidy() does some HTML cleanup
- preceded by the ParserBeforeTidy hook
- followed by the ParserAfterTidy hook
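The conversion attributed to createAssocArgs in the list above can be sketched as follows. This is a minimal stand-in, not the code from Parser.php: the function name `createAssocArgsSketch` is hypothetical, and the whitespace handling (trimming named parameters only) is an assumption.

```php
<?php
// Hypothetical stand-in for the behaviour described for createAssocArgs;
// this is NOT the MediaWiki implementation.
function createAssocArgsSketch(array $args): array
{
    $assoc = [];
    $index = 1; // number of the next unnamed (positional) parameter
    foreach ($args as $arg) {
        $eqPos = strpos($arg, '=');
        if ($eqPos === false) {
            // Unnamed parameter: indexed by its position among unnamed ones.
            $assoc[$index++] = $arg;
        } else {
            // Named parameter: a later definition overwrites an earlier one.
            // Trimming here is an assumption of this sketch.
            $name = trim(substr($arg, 0, $eqPos));
            $assoc[$name] = trim(substr($arg, $eqPos + 1));
        }
    }
    return $assoc;
}

print_r(createAssocArgsSketch(['a=3', '5', 'a=4', '8']));
```

For the input from the text, the result maps a to 4, 1 to 5, and 2 to 8: the second definition of a wins, and only unnamed parameters advance the positional counter.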
In addition, the file contains function setFunctionHook, which is called when a parser function is set up and associates the magic_word_id of the parser function with the name of the PHP function that defines it (see also parser function extensions). This information is stored in the array mFunctionHooks. The array mFunctionSynonyms is also created: its index is a case-sensitivity boolean, and each value is an array whose indexes are the magic words that can be used in the wikitext and whose values are the magic word IDs. Both arrays are used during parsing.
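The shape of those two arrays may be easier to see in a miniature. The sketch below is an illustration only: the class `FunctionHookRegistrySketch`, its `register` and `call` methods, and the idea of passing the synonyms in directly are all assumptions made for this example, and the lookup order is a guess rather than the parser's documented behaviour.

```php
<?php
// Hypothetical miniature of the two arrays described above.
// All names here are illustrative; none of them are MediaWiki APIs.
class FunctionHookRegistrySketch
{
    /** Magic word ID => PHP callback (the role played by mFunctionHooks). */
    public array $functionHooks = [];

    /** Case-sensitivity flag (0 or 1) => [name used in wikitext => magic word ID]. */
    public array $functionSynonyms = [0 => [], 1 => []];

    public function register(string $id, callable $callback, array $synonyms, bool $caseSensitive = false): void
    {
        $this->functionHooks[$id] = $callback;
        $flag = $caseSensitive ? 1 : 0;
        foreach ($synonyms as $synonym) {
            // Case-insensitive names are stored lowercased so lookups can lowercase too.
            $key = $caseSensitive ? $synonym : strtolower($synonym);
            $this->functionSynonyms[$flag][$key] = $id;
        }
    }

    public function call(string $name, string ...$args): ?string
    {
        // Assumed lookup order: exact (case-sensitive) match first, then lowercased.
        $id = $this->functionSynonyms[1][$name]
            ?? $this->functionSynonyms[0][strtolower($name)]
            ?? null;
        return $id === null ? null : ($this->functionHooks[$id])(...$args);
    }
}

$registry = new FunctionHookRegistrySketch();
$registry->register('shout', fn(string $s): string => strtoupper($s), ['shout', 'yell']);
echo $registry->call('YELL', 'hello'), "\n"; // HELLO
```

Keeping the synonym table separate from the callback table means several wikitext names can resolve to one ID, and the ID alone decides which PHP function runs.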
parser->transformMsg()
According to the source code, this function transforms a MediaWiki message by replacing magic variables. It does not transform templates.
parser->preprocess()
Expands templates and variables in the text, producing valid, static wikitext. Also removes comments.
- strip
- replaceVariables. Uses output type OT_PREPROCESS (calls braceSubstitution, argSubstitution)
- unstrip
- Used for Special:ExpandTemplates
parser->recursiveTagParse()
According to the source code, this is a recursive parser entry point that can be called from an extension tag hook.
- calls parser->strip() then parser->internalParse()
parser->internalParse()
According to the source code, this is a helper function for parse() that transforms wiki markup into HTML. Only called for $mOutputType == OT_HTML.
- calls replaceVariables, links, headings
- Open question: #if: still does not seem to be processed here, although it is supposed to be handled in replaceVariables via replace_callback.
parser->replaceVariables()
According to the source code, this function replaces magic variables, templates, and template arguments with the appropriate text. Templates are substituted recursively, taking care to avoid infinite loops. Note that the substitution depends on the value of $mOutputType:
- OT_WIKI: only {{subst:}} templates
- OT_MSG: only magic variables
- OT_HTML: all templates and magic variables
Function hooks are processed in the subroutine braceSubstitution(), which is in turn called from replaceVariables.
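The three-way rule above can be written down as a small decision function. This is only a sketch of the rule as stated in the list: the function `isExpanded`, the use of strings for the output types, and the 'template'/'variable' labels are assumptions of this example, not names from Parser.php.

```php
<?php
// Hypothetical decision function for the rule listed above.
// Output types are passed as plain strings to avoid implying real constant values.
function isExpanded(string $outputType, string $kind, bool $isSubst = false): bool
{
    switch ($outputType) {
        case 'OT_WIKI':
            // Only {{subst:}} templates are expanded.
            return $kind === 'template' && $isSubst;
        case 'OT_MSG':
            // Only magic variables are expanded.
            return $kind === 'variable';
        case 'OT_HTML':
            // All templates and magic variables are expanded.
            return true;
        default:
            throw new InvalidArgumentException("Unhandled output type: $outputType");
    }
}

var_dump(isExpanded('OT_WIKI', 'template'));       // bool(false)
var_dump(isExpanded('OT_WIKI', 'template', true)); // bool(true)
var_dump(isExpanded('OT_MSG', 'template'));        // bool(false)
```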
Parser globals
According to the source (version 1.10),
- Globals used: $wgLang, $wgContLang
- NOT $wgArticle, $wgUser or $wgTitle. Keep them away!
| Function | Language | User | Title/namespace | Other |
|---|---|---|---|---|
| firstCallInit() | | | $wgAllowDisplayTitle | $wgAllowSlowParserFunctions |
| parse() | $wgContLang | | | $wgUseTidy, $wgAlwaysUseTidy |
| getFunctionLang() | $wgLang, $wgContLang | | | |
| strip() | $wgContLang | | | $wgRawHtml |
| tidy() | | | | $wgTidyInternal |
| externalTidy() | | | | $wgTidyConf, $wgTidyBin, $wgTidyOpts |
| internalTidy() | | | | $wgTidyConf |
| replaceExternalLinks() | $wgContLang | | | |
| replaceFreeExternalLinks() | $wgContLang | | | |
| replaceInternalLinks() | $wgContLang | | $wgRestrictedNamespaces | $wgLinkWarn |
| areSubpagesAllowed() | | | $wgNamespacesWithSubpages | |
| getVariableValue() | $wgContLang | | | $wgSitename, $wgServer, $wgServerName, $wgScriptPath, $wgLocaltimezone, $wgContLanguageCode |
| variableSubstitution() | $wgContLang | | | |
| braceSubstitution() | $wgLang, $wgContLang | | $wgAllowDisplayTitle, $wgNonincludableNamespaces, $wgRestrictedNamespaces | |
| fetchTemplate() | $wgLang | | | |
| interwikiTransclude() | | | | $wgEnableScaryTranscluding |
| fetchScaryTemplateMaybeFromCache() | | | | $wgTranscludeCacheExpiry |
| formatHeadings() | $wgContLang | | | $wgMaxTocLevel |
| pstPass2() | $wgContLang | | $wgLegalTitleChars | $wgLocaltimezone |
| cleanSig() | | | $wgTitle | |
| transformMsg() | | | $wgTitle | |
| replaceLinkHolders() | $wgContLang | $wgUser | | |
| getRevisionTimestamp() | $wgContLang | | | |
Parser, unique prefix, \x07UNIQ, and other mysteries
$mUniqPrefix: Cleared out with clearState().
function clearState() {
    --- (snippage) ---

    /**
     * Prefix for temporary replacement strings for the multipass parser.
     * \x07 should never appear in input as it's disallowed in XML.
     * Using it at the front also gives us a little extra robustness
     * since it shouldn't match when butted up against identifier-like
     * string constructs.
     */
    $this->mUniqPrefix = "\x07UNIQ" . Parser::getRandomString();

    --- (snippage) ---
}

/**
 * Accessor for mUniqPrefix.
 *
 * @public
 */
function uniqPrefix() {
    return $this->mUniqPrefix;
}
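To make the purpose of the prefix concrete, here is a minimal sketch of a strip/unstrip round trip built on such a prefix. It is not the Parser.php code: the functions `makeUniqPrefix`, `stripNowiki`, and `unstripSketch`, the marker layout, and the use of `random_bytes` are all assumptions for illustration, and the italics regex merely stands in for the real wikitext pass.

```php
<?php
// Hypothetical sketch of a strip/unstrip round trip; not the MediaWiki code.
function makeUniqPrefix(): string
{
    // \x07 is disallowed in XML, so it should never occur in the input text.
    return "\x07UNIQ" . bin2hex(random_bytes(8));
}

function stripNowiki(string $text, string $prefix, array &$state): string
{
    return preg_replace_callback(
        '/<nowiki>(.*?)<\/nowiki>/s',
        function (array $m) use ($prefix, &$state): string {
            // The marker layout is an assumption of this sketch.
            $marker = $prefix . '-nowiki-' . count($state) . '-END';
            // Remember the protected text, escaped so it is shown literally.
            $state[$marker] = htmlspecialchars($m[1], ENT_NOQUOTES);
            return $marker;
        },
        $text
    );
}

function unstripSketch(string $text, array $state): string
{
    // Put every remembered piece of text back in place of its marker.
    return $state === [] ? $text : strtr($text, $state);
}

$state = [];
$prefix = makeUniqPrefix();
$stripped = stripNowiki("''a'' <nowiki>''b''</nowiki>", $prefix, $state);
// Stand-in for the wikitext pass: it cannot see the protected text.
$html = preg_replace("/''(.*?)''/", '<i>$1</i>', $stripped);
echo unstripSketch($html, $state), "\n"; // <i>a</i> ''b''
```

Because the markers start with a character that cannot legally appear in the input, the intermediate passes cannot mistake ordinary text for a marker or rewrite the protected text.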
The clearState function is only called from within Parser.php, but it is called by all sorts of different routines. Basically, you have to assume that the state gets cleared by any external parser call. But why isn't it saved with clone?
- When an object is cloned, PHP 5 will perform a shallow copy of all of the object's properties. Any properties that are references to other variables, will remain references. -- PHP Manual: Object cloning
That's why. But there's a way around this:
- If a __clone() method is defined, then the newly created object's __clone() method will be called, to allow any necessary properties that need to be changed.
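The PHP behaviour quoted above can be seen in a few lines. The classes below are hypothetical and have nothing to do with the real Parser class; they only illustrate that clone copies scalar properties but shares any object a property points to, unless __clone() replaces it.

```php
<?php
// Hypothetical classes illustrating shallow copies and __clone(); not MediaWiki code.
class StripStateSketch
{
    public array $markers = [];
}

class ShallowParserSketch
{
    public string $uniqPrefix = '';
    public StripStateSketch $stripState;

    public function __construct()
    {
        $this->stripState = new StripStateSketch();
    }
}

class DeepParserSketch extends ShallowParserSketch
{
    public function __clone()
    {
        // Give the copy its own state object instead of a shared one.
        $this->stripState = clone $this->stripState;
    }
}

$a = new ShallowParserSketch();
$a->uniqPrefix = 'one';
$b = clone $a;
$b->uniqPrefix = 'two';          // scalar property: copied, $a is unaffected
$b->stripState->markers[] = 'x'; // object property: same object as in $a
var_dump($a->uniqPrefix);                 // string(3) "one"
var_dump(count($a->stripState->markers)); // int(1)

$c = new DeepParserSketch();
$d = clone $c;
$d->stripState->markers[] = 'x';
var_dump(count($c->stripState->markers)); // int(0)
```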
Passing extra parameters via URL
To be updated: the WebRequest class is used to obtain information from the GET and POST arrays. Using it is recommended over accessing the superglobals directly, since the object does fun stuff like magic_quotes cleaning. See WebRequest and $wgRequest.
- WebRequest->getVal()
- WebRequest->get*()
- WebRequest->wasPosted()
The PHP global $_GET contains an associative array of variables passed to the current script via the HTTP GET method. This is a 'superglobal', or automatic global, variable.
For example, if the URL for a page is
http://www.myhost.com/wiki/index.php?title=MediaWiki_internals&options=test
the contents of $_GET are:
array(2) {
  ["title"]=>
  string(19) "MediaWiki_internals"
  ["options"]=>
  string(4) "test"
}
This is normally of little use, but the code for a special page can make use of it by testing for $_GET['options'].
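As a rough illustration of reading such a parameter defensively, the sketch below parses the example URL with PHP's own `parse_url` and `parse_str` and looks a value up with a default. The helper `getQueryVal` is hypothetical: it is modelled on the general idea of a getVal-style accessor and is not the WebRequest implementation.

```php
<?php
// Hypothetical helper in the spirit of a getVal-style accessor; not MediaWiki code.
function getQueryVal(array $query, string $name, ?string $default = null): ?string
{
    // Missing keys and array-valued parameters fall back to the default.
    if (!isset($query[$name]) || is_array($query[$name])) {
        return $default;
    }
    return (string)$query[$name];
}

$url = 'http://www.myhost.com/wiki/index.php?title=MediaWiki_internals&options=test';

// Rebuild what $_GET would contain for this URL.
parse_str((string)parse_url($url, PHP_URL_QUERY), $query);

echo getQueryVal($query, 'options', 'none'), "\n"; // test
echo getQueryVal($query, 'missing', 'none'), "\n"; // none
```

Going through one accessor with an explicit default avoids undefined-index notices and keeps the special page code from depending on the superglobal directly.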
Skins
MediaWiki outputs the namespace as part of the body's class. For example,
<body class="bodySection ns-0 ltr"><div id="globalWrapper">
<body class="bodySection ns-100 ltr"><div id="globalWrapper">
This means that page styles can be easily modified depending on the namespace. For example, in MediaWiki:Common.css:
/***** BACKGROUND COLORS FOR NAMESPACES *****/

/* Colour of pseudo NS Special (light grey) */
.ns--2 #content,
.ns--2 #p-cactions li,
.ns--2 #p-cactions li a { background: #f4f4f4; }
.ns--2 div.thumb { border-color: #f4f4f4; }

/* Colour of NS Project + Project_talk (light sky blue) */
.ns-4 #content,
.ns-4 #p-cactions li,
.ns-4 #p-cactions li a { background: #f8fcff; }
.ns-4 div.thumb { border-color: #f8fcff; }
.ns-5 #content,
.ns-5 #p-cactions li,
.ns-5 #p-cactions li a { background: #f8fcff; }
.ns-5 div.thumb { border-color: #f8fcff; }

/* Colour of NS MediaWiki + MediaWiki_talk (light grey) */
.ns-8 #content,
.ns-8 #p-cactions li,
.ns-8 #p-cactions li a { background: #f4f4f4; }
.ns-8 div.thumb { border-color: #f4f4f4; }
.ns-9 #content,
.ns-9 #p-cactions li,
.ns-9 #p-cactions li a { background: #f4f4f4; }
.ns-9 div.thumb { border-color: #f4f4f4; }

/* Blue border for Public Domain namespaces.
   This is currently NS Help (but NOT Help_talk) */
.ns-12 #content {
    border: 2px solid #0000CC;
    border-right: none;
    background-image: url(http://upload.wikimedia.org/wikipedia/mediawiki/b/b8/PD-banner.png);
    background-repeat: no-repeat;
    background-position: right top;
}
.ns-12 #bodyContent {
    background-image: url(http://upload.wikimedia.org/wikipedia/mediawiki/6/67/PD-icon-faded.png);
    background-repeat: no-repeat;
    background-position: right 5em;
}

/* Colour of NS Manual + Manual_talk (light bluish violet) */
.ns-100 #content,
.ns-100 #p-cactions li,
.ns-100 #p-cactions li a { background: #f3f3ff; }
.ns-100 div.thumb { border-color: #f3f3ff; }
.ns-101 #content,
.ns-101 #p-cactions li,
.ns-101 #p-cactions li a { background: #f3f3ff; }
.ns-101 div.thumb { border-color: #f3f3ff; }
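The class names follow directly from the namespace number, which is why the Special namespace (-2) ends up as `ns--2` with a double hyphen. The sketch below shows that mapping and generates the repeated rule pair from the stylesheet above; both function names are hypothetical and this is not how the skin itself builds its output.

```php
<?php
// Hypothetical helpers; they only mirror the class pattern and CSS rules shown above.
function bodyClassSketch(int $namespace, string $dir = 'ltr'): string
{
    // A negative namespace number yields a double hyphen, e.g. "ns--2".
    return "bodySection ns-$namespace $dir";
}

function namespaceBackgroundCss(int $namespace, string $color): string
{
    $p = ".ns-$namespace";
    return "$p #content, $p #p-cactions li, $p #p-cactions li a { background: $color; }\n"
         . "$p div.thumb { border-color: $color; }\n";
}

echo bodyClassSketch(-2), "\n"; // bodySection ns--2 ltr

// A subject namespace and its talk namespace share one colour.
foreach ([100, 101] as $ns) {
    echo namespaceBackgroundCss($ns, '#f3f3ff');
}
```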
Sidebar
The sidebar is used to build the navigation menu, normally at the left of the content, but used for the tab bar at the top of each page in this wiki.
The default sidebar contents are stored in the system message file, but if there's a MediaWiki:Sidebar page, then the contents of that are used instead. The contents of any templates are expanded before the sidebar is generated.
Source code (MediaWiki version 1.10)
Skin.php 1606: function buildSidebar() {
Skin.php 1627: $lines = explode( "\n", wfMsgForContent( 'sidebar' ) );
GlobalFunctions.php 360: function wfMsgForContent( $key ) {
GlobalFunctions.php 368: return wfMsgReal( $key, $args, true, $forcontent );
GlobalFunctions.php 418: function wfMsgReal( $key, $args, $useDB = true, $forContent=false, $transform = true ) {
GlobalFunctions.php 421: $message = wfMsgGetKey( $key, $useDB, $forContent, $transform );
GlobalFunctions.php 454: function wfMsgGetKey( $key, $useDB, $forContent = false, $transform = true ) {
GlobalFunctions.php 481: if ( $transform && strstr( $message, '{{' ) !== false ) {
GlobalFunctions.php 482: $message = $wgParser->transformMsg($message, $wgMessageCache->getParserOptions() );
Transform a MediaWiki message by replacing magic variables (and templates)
Parser.php 3854: function transformMsg( $text, $options ) {
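The guard at GlobalFunctions.php line 481 means that the parser is only invoked for messages that actually contain double braces. A minimal sketch of that idea follows; the function `getMessageSketch`, the in-memory message array, the example message keys, and the fake transform callback are all assumptions of this example, not MediaWiki code.

```php
<?php
// Hypothetical sketch of the "only transform if the text contains {{" guard.
function getMessageSketch(array $messages, string $key, callable $transformMsg, bool $transform = true): string
{
    // Returning an empty string for a missing key is an assumption of this sketch.
    $message = $messages[$key] ?? '';
    if ($transform && strstr($message, '{{') !== false) {
        $message = $transformMsg($message);
    }
    return $message;
}

$calls = 0;
// Stand-in for the parser's message transformation; counts how often it runs.
$fakeTransform = function (string $text) use (&$calls): string {
    $calls++;
    return str_replace('{{SITENAME}}', 'Example Wiki', $text);
};

$messages = ['plain' => 'Main page', 'tagline' => 'From {{SITENAME}}'];

echo getMessageSketch($messages, 'plain', $fakeTransform), "\n";   // Main page
echo getMessageSketch($messages, 'tagline', $fakeTransform), "\n"; // From Example Wiki
echo $calls, "\n"; // 1
```

The cheap substring test keeps the comparatively expensive transformation away from the many messages that contain no braces at all.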