HTML5
A vocabulary and associated APIs for HTML and XHTML
W3C Working Draft 29 March 2012

Abstract

This specification defines the 5th major revision of the core language of the World Wide Web: the Hypertext Markup Language (HTML). In this version, new features are introduced to help Web application authors, new elements are introduced based on research into prevailing authoring practices, and special attention has been given to defining clear conformance criteria for user agents in an effort to improve interoperability.

Status of This Document

Table of Contents

1 Introduction
  1.1 Background
  1.2 Audience
  1.3 Scope
  1.4 History
  1.5 Design notes
    1.5.1 Serializability of script execution
    1.5.2 Compliance with other specifications
  1.6 HTML vs XHTML
  1.7 Structure of this specification
    1.7.1 How to read this specification
    1.7.2 Typographic conventions
  1.8 A quick introduction to HTML
    1.8.1 Writing secure applications with HTML
    1.8.2 Common pitfalls to avoid when using the scripting APIs
  1.9 Conformance requirements for authors
    1.9.1 Presentational markup
    1.9.2 Syntax errors
    1.9.3 Restrictions on content models and on attribute values
  1.10 Recommended reading
2 Common infrastructure
  2.1 Terminology
    2.1.1 Resources
    2.1.2 XML
    2.1.3 DOM trees
    2.1.4 Scripting
    2.1.5 Plugins
    2.1.6 Character encodings
  2.2 Conformance requirements
    2.2.1 Conformance classes
    2.2.2 Dependencies
    2.2.3 Extensibility
  2.3 Case-sensitivity and string comparison
  2.4 UTF-8
  2.5 Common microsyntaxes
    2.5.1 Common parser idioms
    2.5.2 Boolean attributes
    2.5.3 Keywords and enumerated attributes
    2.5.4 Numbers
      Signed integers
      Non-negative integers
      Floating-point numbers
      Percentages and lengths
      Lists of integers
      Lists of dimensions
    2.5.5 Dates and times
      Months
      Dates
      Yearless dates
      Times
      Local dates and times
      Time zones
      Global dates and times
      Weeks
      Durations
      Vaguer moments in time
    2.5.6 Colors
    2.5.7 Space-separated tokens
    2.5.8 Comma-separated tokens
    2.5.9 References
    2.5.10 Media queries
  2.6 URLs
    2.6.1 Terminology
    2.6.2 Parsing URLs
    2.6.3 Resolving URLs
    2.6.4 URL manipulation and creation
    2.6.5 Dynamic changes to base URLs
    2.6.6 Interfaces for URL manipulation
  2.7 Fetching resources
    2.7.1 Protocol concepts
    2.7.2 Encrypted HTTP and related security concerns
    2.7.3 Determining the type of a resource
    2.7.4 Extracting encodings from meta elements
    2.7.5 CORS settings attributes
    2.7.6 CORS-enabled fetch
  2.8 Common DOM interfaces
    2.8.1 Reflecting content attributes in IDL attributes
    2.8.2 Collections
      HTMLAllCollection
      HTMLFormControlsCollection
      HTMLOptionsCollection
    2.8.3 DOMStringMap
    2.8.4 Transferable objects
    2.8.5 Safe passing of structured data
    2.8.6 DOM feature strings
    2.8.7 Garbage collection
  2.9 Namespaces
3 Semantics, structure, and APIs of HTML documents
  3.1 Documents
    3.1.1 Documents in the DOM
    3.1.2 Security
    3.1.3 Resource metadata management
    3.1.4 DOM tree accessors
    3.1.5 Loading XML documents
  3.2 Elements
    3.2.1 Semantics
    3.2.2 Elements in the DOM
    3.2.3 Global attributes
      The id attribute
      The title attribute
      The lang and xml:lang attributes
      The translate attribute
      The xml:base attribute (XML only)
      The dir attribute
      The class attribute
      The style attribute
      Embedding custom non-visible data with the data-* attributes
    3.2.4 Element definitions
      Attributes
    3.2.5 Content models
      Kinds of content
      Metadata content
      Flow content
      Sectioning content
      Heading content
      Phrasing content
      Embedded content
      Interactive content
      Palpable content
      Transparent content models
      Paragraphs
    3.2.6 Requirements relating to bidirectional-algorithm formatting characters
    3.2.7 WAI-ARIA
  3.3 Interactions with XPath and XSLT
  3.4 Dynamic markup insertion
    3.4.1 Opening the input stream
    3.4.2 Closing the input stream
    3.4.3 document.write()
    3.4.4 document.writeln()
4 The elements of HTML
  4.1 The root element
    4.1.1 The html element
  4.2 Document metadata
    4.2.1 The head element
    4.2.2 The title element
    4.2.3 The base element
    4.2.4 The link element
    4.2.5 The meta element
      Standard metadata names
      Other metadata names
      Pragma directives
      Other pragma directives
      Specifying the document's character encoding
    4.2.6 The style element
    4.2.7 Styling
  4.3 Scripting
    4.3.1 The script element
      Scripting languages
      Restrictions for contents of script elements
      Inline documentation for external scripts
      Interaction of script elements and XSLT
    4.3.2 The noscript element
  4.4 Sections
    4.4.1 The body element
    4.4.2 The section element
    4.4.3 The nav element
    4.4.4 The article element
    4.4.5 The aside element
    4.4.6 The h1, h2, h3, h4, h5, and h6 elements
    4.4.7 The hgroup element
    4.4.8 The header element
    4.4.9 The footer element
    4.4.10 The address element
    4.4.11 Headings and sections
      Creating an outline
  4.5 Grouping content
    4.5.1 The p element
    4.5.2 The hr element
    4.5.3 The pre element
    4.5.4 The blockquote element
    4.5.5 The ol element
    4.5.6 The ul element
    4.5.7 The li element
    4.5.8 The dl element
    4.5.9 The dt element
    4.5.10 The dd element
    4.5.11 The figure element
    4.5.12 The figcaption element
    4.5.13 The div element
  4.6 Text-level semantics
    4.6.1 The a element
    4.6.2 The em element
    4.6.3 The strong element
    4.6.4 The small element
    4.6.5 The s element
    4.6.6 The cite element
    4.6.7 The q element
    4.6.8 The dfn element
    4.6.9 The abbr element
    4.6.10 The time element
    4.6.11 The code element
    4.6.12 The var element
    4.6.13 The samp element
    4.6.14 The kbd element
    4.6.15 The sub and sup elements
    4.6.16 The i element
    4.6.17 The b element
    4.6.18 The u element
    4.6.19 The mark element
    4.6.20 The ruby element
    4.6.21 The rt element
    4.6.22 The rp element
    4.6.23 The bdi element
    4.6.24 The bdo element
    4.6.25 The span element
    4.6.26 The br element
    4.6.27 The wbr element
    4.6.28 Usage summary
  4.7 Edits
    4.7.1 The ins element
    4.7.2 The del element
    4.7.3 Attributes common to ins and del elements
    4.7.4 Edits and paragraphs
    4.7.5 Edits and lists
    4.7.6 Edits and tables
  4.8 Embedded content
    4.8.1 The img element
      Requirements for providing text to act as an alternative for images
      General guidelines
      A link or button containing nothing but the image
      A phrase or paragraph with an alternative graphical representation: charts, diagrams, graphs, maps, illustrations
      A short phrase or label with an alternative graphical representation: icons, logos
      Text that has been rendered to a graphic for typographical effect
      A graphical representation of some of the surrounding text
      A purely decorative image that doesn't add any information
      A group of images that form a single larger picture with no links
      A group of images that form a single larger picture with links
      A key part of the content
      An image not intended for the user
      Guidance for markup generators
      Guidance for conformance checkers
    4.8.2 The iframe element
    4.8.3 The embed element
    4.8.4 The object element
    4.8.5 The param element
    4.8.6 The video element
    4.8.7 The audio element
    4.8.8 The source element
    4.8.9 The track element
    4.8.10 Media elements
      Error codes
      Location of the media resource
      MIME types
      Network states
      Loading the media resource
      Offsets into the media resource
      Ready states
      Playing the media resource
      Seeking
      Media resources with multiple media tracks
      AudioTrackList and VideoTrackList objects
      Selecting specific audio and video tracks declaratively
      Synchronising multiple media elements
      Introduction
      Media controllers
      Assigning a media controller declaratively
      Timed text tracks
      Text track model
      Sourcing in-band text tracks
      Sourcing out-of-band text tracks
      Guidelines for exposing cues in various formats as text track cues
      Text track API
      Text tracks describing chapters
      Event definitions
      User interface
      Time ranges
      Event definitions
      Event summary
      Security and privacy considerations
      Best practices for authors using media elements
      Best practices for implementors of media elements
    4.8.11 The canvas element
      Color spaces and color correction
      Security with canvas elements
    4.8.12 The map element
    4.8.13 The area element
    4.8.14 Image maps
      Authoring
      Processing model
    4.8.15 MathML
    4.8.16 SVG
    4.8.17 Dimension attributes
  4.9 Tabular data
    4.9.1 The table element
      Techniques for describing tables
      Techniques for table layout
    4.9.2 The caption element
    4.9.3 The colgroup element
    4.9.4 The col element
    4.9.5 The tbody element
    4.9.6 The thead element
    4.9.7 The tfoot element
    4.9.8 The tr element
    4.9.9 The td element
    4.9.10 The th element
    4.9.11 Attributes common to td and th elements
    4.9.12 Processing model
      Forming a table
      Forming relationships between data cells and header cells
    4.9.13 Examples
  4.10 Forms
    4.10.1 Introduction
      Writing a form's user interface
      Implementing the server-side processing for a form
      Configuring a form to communicate with a server
      Client-side form validation
      Date, time, and number formats
    4.10.2 Categories
    4.10.3 The form element
    4.10.4 The fieldset element
    4.10.5 The legend element
    4.10.6 The label element
    4.10.7 The input element
      States of the type attribute
      Hidden state (type=hidden)
      Text (type=text) state and Search state (type=search)
      Telephone state (type=tel)
      URL state (type=url)
      E-mail state (type=email)
      Password state (type=password)
      Date and Time state (type=datetime)
      Date state (type=date)
      Month state (type=month)
      Week state (type=week)
      Time state (type=time)
      Local Date and Time state (type=datetime-local)
      Number state (type=number)
      Range state (type=range)
      Color state (type=color)
      Checkbox state (type=checkbox)
      Radio Button state (type=radio)
      File Upload state (type=file)
      Submit Button state (type=submit)
      Image Button state (type=image)
      Reset Button state (type=reset)
      Button state (type=button)
      Implementation notes regarding localization of form controls
      Common input element attributes
      The autocomplete attribute
      The dirname attribute
      The list attribute
      The readonly attribute
      The size attribute
      The required attribute
      The multiple attribute
      The maxlength attribute
      The pattern attribute
      The min and max attributes
      The step attribute
      The placeholder attribute
      Common input element APIs
      Common event behaviors
    4.10.8 The button element
    4.10.9 The select element
    4.10.10 The datalist element
    4.10.11 The optgroup element
    4.10.12 The option element
    4.10.13 The textarea element
    4.10.14 The keygen element
    4.10.15 The output element
    4.10.16 The progress element
    4.10.17 The meter element
    4.10.18 Association of controls and forms
    4.10.19 Attributes common to form controls
      Naming form controls
      Enabling and disabling form controls
      A form control's value
      Autofocusing a form control
      Limiting user input length
      Form submission
      Submitting element directionality
    4.10.20 APIs for the text field selections
    4.10.21 Constraints
      Definitions
      Constraint validation
      The constraint validation API
      Security
    4.10.22 Form submission
      Introduction
      Implicit submission
      Form submission algorithm
      Constructing the form data set
      URL-encoded form data
      Multipart form data
      Plain text form data
    4.10.23 Resetting a form
  4.11 Interactive elements
    4.11.1 The details element
    4.11.2 The summary element
    4.11.3 The command element
    4.11.4 The menu element
      Introduction
      Building menus and toolbars
      Context menus
      Toolbars
    4.11.5 Commands
      Using the a element to define a command
      Using the button element to define a command
      Using the input element to define a command
      Using the option element to define a command
      Using the command element to define a command
      Using the command attribute on command elements to define a command indirectly
      Using the accesskey attribute on a label element to define a command
      Using the accesskey attribute on a legend element to define a command
      Using the accesskey attribute to define a command on other elements
    4.11.6 The dialog element
      Anchor points
  4.12 Links
    4.12.1 Introduction
    4.12.2 Links created by a and area elements
    4.12.3 Following hyperlinks
    4.12.4 Link types
      Link type "alternate"
      Link type "author"
      Link type "bookmark"
      Link type "help"
      Link type "icon"
      Link type "license"
      Link type "nofollow"
      Link type "noreferrer"
      Link type "prefetch"
      Link type "search"
      Link type "stylesheet"
      Link type "tag"
      Sequential link types
      Link type "next"
      Link type "prev"
      Other link types
  4.13 Common idioms without dedicated elements
    4.13.1 The main part of the content
    4.13.2 Bread crumb navigation
    4.13.3 Tag clouds
    4.13.4 Conversations
    4.13.5 Footnotes
  4.14 Matching HTML elements using selectors
    4.14.1 Case-sensitivity
    4.14.2 Pseudo-classes
5 Loading Web pages
  5.1 Browsing contexts
    5.1.1 Nested browsing contexts
      Navigating nested browsing contexts in the DOM
    5.1.2 Auxiliary browsing contexts
      Navigating auxiliary browsing contexts in the DOM
    5.1.3 Secondary browsing contexts
    5.1.4 Security
    5.1.5 Groupings of browsing contexts
    5.1.6 Browsing context names
  5.2 The Window object
    5.2.1 Security
    5.2.2 APIs for creating and navigating browsing contexts by name
    5.2.3 Accessing other browsing contexts
    5.2.4 Named access on the Window object
    5.2.5 Garbage collection and browsing contexts
    5.2.6 Browser interface elements
    5.2.7 The WindowProxy object
  5.3 Origin
    5.3.1 Relaxing the same-origin restriction
  5.4 Sandboxing
  5.5 Session history and navigation
    5.5.1 The session history of browsing contexts
    5.5.2 The History interface
    5.5.3 The Location interface
      Security
    5.5.4 Implementation notes for session history
  5.6 Browsing the Web
    5.6.1 Navigating across documents
    5.6.2 Page load processing model for HTML files
    5.6.3 Page load processing model for XML files
    5.6.4 Page load processing model for text files
    5.6.5 Page load processing model for multipart/x-mixed-replace resources
    5.6.6 Page load processing model for media
    5.6.7 Page load processing model for content that uses plugins
    5.6.8 Page load processing model for inline content that doesn't have a DOM
    5.6.9 Navigating to a fragment identifier
    5.6.10 History traversal
      Event definitions
    5.6.11 Unloading documents
      Event definition
    5.6.12 Aborting a document load
  5.7 Offline Web applications
    5.7.1 Introduction
      Event summary
    5.7.2 Application caches
    5.7.3 The cache manifest syntax
      Some sample manifests
      Writing cache manifests
      Parsing cache manifests
    5.7.4 Downloading or updating an application cache
    5.7.5 The application cache selection algorithm
    5.7.6 Changes to the networking model
    5.7.7 Expiring application caches
    5.7.8 Disk space
    5.7.9 Application cache API
    5.7.10 Browser state
6 Web application APIs
  6.1 Scripting
    6.1.1 Introduction
    6.1.2 Enabling and disabling scripting
    6.1.3 Processing model
      Definitions
      Calling scripts
      Creating scripts
      Killing scripts
      Runtime script errors
      Runtime script errors in documents
    6.1.4 Event loops
      Definitions
      Processing model
      Generic task sources
    6.1.5 The javascript: URL scheme
    6.1.6 Events
      Event handlers
      Event handlers on elements, Document objects, and Window objects
      Event firing
      Events and the Window object
  6.2 Base64 utility methods
  6.3 Timers
  6.4 User prompts
    6.4.1 Simple dialogs
    6.4.2 Printing
    6.4.3 Dialogs implemented using separate documents
  6.5 System state and capabilities
    6.5.1 The Navigator object
      Client identification
      Custom scheme and content handlers
      Security and privacy
      Sample user interface
      Manually releasing the storage mutex
    6.5.2 The External interface
7 User interaction
  7.1 The hidden attribute
  7.2 Inert subtrees
  7.3 Activation
  7.4 Focus
    7.4.1 Sequential focus navigation and the tabindex attribute
    7.4.2 Focus management
    7.4.3 Document-level focus APIs
    7.4.4 Element-level focus APIs
  7.5 Assigning keyboard shortcuts
    7.5.1 Introduction
    7.5.2 The accesskey attribute
    7.5.3 Processing model
  7.6 Editing
    7.6.1 Making document regions editable: The contenteditable content attribute
    7.6.2 Making entire documents editable: The designMode IDL attribute
    7.6.3 Best practices for in-page editors
    7.6.4 Editing APIs
    7.6.5 Spelling and grammar checking
  7.7 Drag and drop
    7.7.1 Introduction
    7.7.2 The drag data store
    7.7.3 The DataTransfer interface
      The DataTransferItemList interface
      The DataTransferItem interface
    7.7.4 The DragEvent interface
    7.7.5 Drag-and-drop processing model
    7.7.6 Events summary
    7.7.7 The draggable attribute
    7.7.8 The dropzone attribute
    7.7.9 Security risks in the drag-and-drop model
8 The HTML syntax
  8.1 Writing HTML documents
    8.1.1 The DOCTYPE
    8.1.2 Elements
      Start tags
      End tags
      Attributes
      Optional tags
      Restrictions on content models
      Restrictions on the contents of raw text and RCDATA elements
    8.1.3 Text
      Newlines
    8.1.4 Character references
    8.1.5 CDATA sections
    8.1.6 Comments
  8.2 Parsing HTML documents
    8.2.1 Overview of the parsing model
    8.2.2 The input byte stream
      Determining the character encoding
      Character encodings
      Changing the encoding while parsing
      Preprocessing the input stream
    8.2.3 Parse state
      The insertion mode
      The stack of open elements
      The list of active formatting elements
      The element pointers
      Other parsing state flags
    8.2.4 Tokenization
      Data state
      Character reference in data state
      RCDATA state
      Character reference in RCDATA state
      RAWTEXT state
      Script data state
      PLAINTEXT state
      Tag open state
      End tag open state
      Tag name state
      RCDATA less-than sign state
      RCDATA end tag open state
      RCDATA end tag name state
      RAWTEXT less-than sign state
      RAWTEXT end tag open state
      RAWTEXT end tag name state
      Script data less-than sign state
      Script data end tag open state
      Script data end tag name state
      Script data escape start state
      Script data escape start dash state
      Script data escaped state
      Script data escaped dash state
      Script data escaped dash dash state
      Script data escaped less-than sign state
      Script data escaped end tag open state
      Script data escaped end tag name state
      Script data double escape start state
      Script data double escaped state
      Script data double escaped dash state
      Script data double escaped dash dash state
      Script data double escaped less-than sign state
      Script data double escape end state
      Before attribute name state
      Attribute name state
      After attribute name state
      Before attribute value state
      Attribute value (double-quoted) state
      Attribute value (single-quoted) state
      Attribute value (unquoted) state
      Character reference in attribute value state
      After attribute value (quoted) state
      Self-closing start tag state
      Bogus comment state
      Markup declaration open state
      Comment start state
      Comment start dash state
      Comment state
      Comment end dash state
      Comment end state
      Comment end bang state
      DOCTYPE state
      Before DOCTYPE name state
      DOCTYPE name state
      After DOCTYPE name state
      After DOCTYPE public keyword state
      Before DOCTYPE public identifier state
      DOCTYPE public identifier (double-quoted) state
      DOCTYPE public identifier (single-quoted) state
      After DOCTYPE public identifier state
      Between DOCTYPE public and system identifiers state
      After DOCTYPE system keyword state
      Before DOCTYPE system identifier state
      DOCTYPE system identifier (double-quoted) state
      DOCTYPE system identifier (single-quoted) state
      After DOCTYPE system identifier state
      Bogus DOCTYPE state
      CDATA section state
      Tokenizing character references
    8.2.5 Tree construction
      Creating and inserting elements
      Closing elements that have implied end tags
      Foster parenting
      The rules for parsing tokens in HTML content
      The "initial" insertion mode
      The "before html" insertion mode
      The "before head" insertion mode
      The "in head" insertion mode
      The "in head noscript" insertion mode
      The "after head" insertion mode
      The "in body" insertion mode
      The "text" insertion mode
      The "in table" insertion mode
      The "in table text" insertion mode
      The "in caption" insertion mode
      The "in column group" insertion mode
      The "in table body" insertion mode
      The "in row" insertion mode
      The "in cell" insertion mode
      The "in select" insertion mode
      The "in select in table" insertion mode
      The "after body" insertion mode
      The "in frameset" insertion mode
      The "after frameset" insertion mode
      The "after after body" insertion mode
      The "after after frameset" insertion mode
      The rules for parsing tokens in foreign content
    8.2.6 The end
    8.2.7 Coercing an HTML DOM into an infoset
    8.2.8 An introduction to error handling and strange cases in the parser
      Misnested tags: <b><i></b></i>
      Misnested tags: <b><p></b></p>
      Unexpected markup in tables
      Scripts that modify the page as it is being parsed
      The execution of scripts that are moving across multiple documents
      Unclosed formatting elements
  8.3 Serializing HTML fragments
  8.4 Parsing HTML fragments
  8.5 Named character references
9 The XHTML syntax
  9.1 Writing XHTML documents
  9.2 Parsing XHTML documents
  9.3 Serializing XHTML fragments
  9.4 Parsing XHTML fragments
10 Rendering
  10.1 Introduction
  10.2 The CSS user agent style sheet and presentational hints
  10.3 Non-replaced elements
    10.3.1 Hidden elements
    10.3.2 The page
    10.3.3 Flow content
    10.3.4 Phrasing content
    10.3.5 Bidirectional text
    10.3.6 Quotes
    10.3.7 Sections and headings
    10.3.8 Lists
    10.3.9 Tables
    10.3.10 Form controls
    10.3.11 The hr element
    10.3.12 The fieldset element
  10.4 Replaced elements
    10.4.1 Embedded content
    10.4.2 Images
    10.4.3 Attributes for embedded content and images
    10.4.4 Image maps
    10.4.5 Toolbars
  10.5 Bindings
    10.5.1 Introduction
    10.5.2 The button element
    10.5.3 The details element
    10.5.4 The input element as a text entry widget
    10.5.5 The input element as domain-specific widgets
    10.5.6 The input element as a range control
    10.5.7 The input element as a color well
    10.5.8 The input element as a checkbox and radio button widgets
    10.5.9 The input element as a file upload control
    10.5.10 The input element as a button
    10.5.11 The marquee element
    10.5.12 The meter element
    10.5.13 The progress element
    10.5.14 The select element
    10.5.15 The textarea element
    10.5.16 The keygen element
  10.6 Frames and framesets
  10.7 Interactive media
    10.7.1 Links, forms, and navigation
    10.7.2 The title attribute
    10.7.3 Editing hosts
    10.7.4 Text rendered in native user interfaces
  10.8 Print media
11 Obsolete features
  11.1 Obsolete but conforming features
    11.1.1 Warnings for obsolete but conforming features
  11.2 Non-conforming features
  11.3 Requirements for implementations
    11.3.1 The applet element
    11.3.2 The marquee element
    11.3.3 Frames
    11.3.4 Other elements, attributes and APIs
12 IANA considerations
  12.1 text/html
  12.2 multipart/x-mixed-replace
  12.3 application/xhtml+xml
  12.4 application/x-www-form-urlencoded
  12.5 text/cache-manifest
  12.6 web+ scheme prefix
Index
  Elements
  Element content categories
  Attributes
  Interfaces
  Events
References
Acknowledgements

