MHTML: Difference between revisions

From HandWiki
imported>Steve2012
over-write
 
fix
 
Line 1: Line 1:
{{short description|Web page archive (file) format}}
{{Infobox file format
{{Infobox file format
| name = MHTML
| name = MHTML
Line 5: Line 4:
| screenshot =
| screenshot =
| extension = .mht, .mhtml
| extension = .mht, .mhtml
| mime = multipart/related application/x-mimearchive
| mime = {{ubl|class=nowrap|multipart/related|application/x-mimearchive|message/rfc822}}
| type code =
| type code =
| uniform type =
| uniform type =
Line 17: Line 16:
}}
}}


'''MHTML''', an initialism of "MIME encapsulation of aggregate [[HTML]] documents", is a [[Web archive file]] format used to combine, in a single [[Computer file|computer file]], the HTML code and its companion resources (such as images) that are represented by external [[Hyperlink|hyperlink]]s in the web page's HTML code. The content of an MHTML file is encoded using the same techniques that were first developed for [[HTML email]] messages, using the MIME content type <code>multipart/related</code>.<ref>{{cite web|first=Amanda |last=Holden|title=Difference of HTML & MHTML|url=https://www.techwalla.com/articles/difference-of-html-mhtml|access-date=17 November 2017|archive-url=https://web.archive.org/web/20171117122700/https://www.techwalla.com/articles/difference-of-html-mhtml|archive-date=17 November 2017|url-status=dead}}</ref> MHTML files use an '''.mhtml''' or '''.mht''' [[Filename extension|filename extension]].
'''MHTML''', an initialism of "MIME encapsulation of aggregate [[HTML]] documents", is a [[List of web archiving file formats|web archiving file format]] used to combine, in a single [[Computer file|computer file]], the HTML code and its companion resources (such as images) that are represented by external [[Hyperlink|hyperlink]]s in the web page's HTML code.<ref>{{Cite web |date=9 November 2009 |title=Key Concepts of MHTML |url=https://learn.microsoft.com/nb-no/previous-versions/office/developer/exchange-server-2010/aa563378(v=exchg.140) |url-status=live |archive-url=http://web.archive.org/web/20260221072103/https://learn.microsoft.com/nb-no/previous-versions/office/developer/exchange-server-2010/aa563378(v=exchg.140) |archive-date=21 February 2026 |access-date=21 February 2026 |website=Microsoft}}</ref> The content of an MHTML file is encoded using the same techniques that were first developed for [[HTML email]] messages, using the [[Media type|MIME content type]] <code>multipart/related</code>.<ref>{{cite web |first=Amanda |last=Holden |title=Difference of HTML & MHTML |url=https://www.techwalla.com/articles/difference-of-html-mhtml |access-date=17 November 2017 |archive-url=https://web.archive.org/web/20171117122700/https://www.techwalla.com/articles/difference-of-html-mhtml |archive-date=17 November 2017 |url-status=dead}}</ref> MHTML files use an <code>.mhtml</code> or <code>.mht</code> [[Filename extension|filename extension]].


The first part of the file is an e-mail header. The second part is normally HTML code. Subsequent parts are additional resources identified by their original [[URL|uniform resource locator]]s (URLs) and encoded in [[Base64|base64]] [[Binary-to-text encoding|binary-to-text encoding]]. MHTML was proposed as an open standard, then circulated in a revised edition in 1999 as RFC 2557.
The first part of the file is an e-mail header. The second part is normally HTML code. Subsequent parts are additional resources identified by their original [[URL|uniform resource locator]]s (URLs) and encoded in [[Base64]] [[Binary-to-text encoding|binary-to-text encoding]]. MHTML was proposed as an [[Engineering:Open standard#File formats|open standard]], then circulated in a revised edition in 1999 as RFC 2557.


The .mhtml (Web archive) and [[Email#Filename extensions|.eml]] (email) filename extensions are interchangeable: either filename extension can be changed from one to the other. An .eml message can be sent by e-mail, and it can be displayed by an [[Software:Email client|email client]]. An email message can be saved using a .mhtml or .mht filename extension and then opened for display in a [[Software:Web browser|web browser]] or for editing other programs, including [[Software:Word processor|word processor]]s and [[Software:Text editor|text editor]]s.
The <code>.mhtml</code> and <code>[[Email#Filename extensions|.eml]]</code> filename extensions are interchangeable: either filename extension can be changed from one to the other. An .eml message can be sent by e-mail, and it can be displayed by an [[Software:Email client|email client]]. An email message can be saved using a <code>.mhtml</code> or <code>.mht</code> filename extension and then opened for display in a [[Software:Web browser|web browser]] or for editing other programs, including [[Software:Word processor|word processor]]s and [[Software:Text editor|text editor]]s.


== Layout ==
== Layout ==
The header of an MHTML file contains metadata such as a date and time stamp, page title, the source URL, and a unique randomized boundary string for separating resources contained within the file. The boundary string is defined at the beginning and used throughout the file.
The [[Header (computing)|header]] of an MHTML file contains [[Metadata#File metadata|metadata]] such as a date and time stamp, page title, the source URL, and a unique randomized [[Delimiter#Content boundary|boundary string]] for separating resources contained within the file. The boundary string is defined at the beginning and used throughout the file.


<source lang="email">
<syntaxhighlight lang="email">
From: <Saved by Blink>
From: <Saved by Blink>
Snapshot-Content-Location: https://en.wikipedia.org/wiki/Smartphone
Snapshot-Content-Location: https://en.wikipedia.org/wiki/Smartphone
Line 35: Line 34:
         type="text/html";
         type="text/html";
         boundary="----MultipartBoundary--GsIBda0vjy2AKIAIliwl7JMwezXDRjDAsLje9khd5l----"
         boundary="----MultipartBoundary--GsIBda0vjy2AKIAIliwl7JMwezXDRjDAsLje9khd5l----"
</source>
</syntaxhighlight>


Then, the page resources are contained sequentially, starting with the page's rendered HTML source code. Each resource has its own metadata header which specifies its MIME type and the original location.
Then, the page resources are contained sequentially, starting with the page's rendered HTML source code. Each resource has its own metadata header which specifies its MIME type and the original location.


<source lang="email">
<syntaxhighlight lang="email">
------MultipartBoundary--GsIBda0vjy2AKIAIliwl7JMwezXDRjDAsLje9khd5l----
------MultipartBoundary--GsIBda0vjy2AKIAIliwl7JMwezXDRjDAsLje9khd5l----
Content-Type: text/html
Content-Type: text/html
Line 47: Line 46:


<!DOCTYPE html>
<!DOCTYPE html>
</source>
</syntaxhighlight>


The MHTML file ends with a boundary string that is not followed by any data.
The MHTML file ends with a boundary string that is not followed by any data.<ref>{{cite web |title=2. The MHTML File Format - Hunchly Knowledge Base |url=https://support.hunch.ly/article/51-1-the-mhtml-file-format |website=support.hunch.ly |access-date=24 September 2022 |date=October 17, 2018}}</ref>


<ref>{{cite web |title=2. The MHTML File Format - Hunchly Knowledge Base |url=https://support.hunch.ly/article/51-1-the-mhtml-file-format |website=support.hunch.ly |access-date=24 September 2022 |date=October 17, 2018}}</ref>
== MIME type ==
 
==Browser support==
 
Some browsers support the MHTML format, either directly or through third-party extensions, but the process for saving a web page along with its resources as an MHTML file is not standardized. Due to this, a web page saved as an MHTML file using one browser may render differently on another.
 
===Internet Explorer===
As of version 5.0, [[Software:Internet Explorer|IE]] was the first browser to support reading and saving web pages and external resources to a single MHTML file.
 
===Microsoft Edge===
As of [[Software:Microsoft Edge#Anaheim (2019–present)|switching to the Chromium source code]], Edge supports saving as MHTML.
 
===Opera===
Support for saving web pages as MHTML files was made available in the [[Software:Opera (web browser)|Opera]] 9.0 web browser.<ref>{{cite web|url=http://my.opera.com/desktopteam/blog/show.dml/172375 |title=…and one more weekly! |last=Santambrogio |first=Claudio |date=10 March 2006 |publisher=Opera Software |access-date=2009-05-15 |archive-url=https://web.archive.org/web/20100115001636/http://my.opera.com/desktopteam/blog/show.dml/172375 |archive-date=15 January 2010 |url-status=dead }}</ref> From Opera 9.50 through the rest of the Presto-based Opera product line (currently at Opera 12.16 as of 19 July 2013), the default format for saving pages is MHTML. The initial release of the new Webkit/Blink-based Opera (Opera 15) did not support MHTML, but subsequent releases (Opera 16 onwards) do.
 
MHTML can be enabled by typing "opera://flags#save-page-as-mhtml" at the address bar.
 
=== Google Chrome ===
Creating MHTML files in Google Chrome is enabled by default in version 86.
 
=== Yandex Browser ===
Creating MHTML (multipart/related) files in Yandex Browser is enabled by default in version 22.7.4.960 (July 2022).
 
=== Vivaldi ===
Similarly to Google Chrome, the [[Software:Chromium (web browser)|Chromium]]-based [[Software:Vivaldi (web browser)|Vivaldi browser]] can save webpages as MHTML files since the 2.3 release.<ref>{{Cite web|url=https://vivaldi.com/fr/blog/auto-stacking-tabs/|title=Vivaldi Update {{!}} Auto-Stacking Tabs|last1=février 6|first1=Publié sur|last2=Tetzchner|first2=2019-Par Jon von|date=2019-02-06|website=Vivaldi|language=fr|access-date=2019-05-16}}</ref>
 
It supports both reading and writing MHTML files by toggling the "vivaldi://flags/#save-page-as-mhtml" option.
 
=== Firefox ===
[[Software:Mozilla Firefox|Mozilla Firefox]] does not support MHTML.<ref>{{cite web|title=Bug 40873 - Save as rfc 2557 MHTML; complete webpage in one file|url=https://bugzilla.mozilla.org/show_bug.cgi?id=40873}}</ref> Until the advent of [[Software:Firefox version history#Firefox 52 through 59|version 57 ("Firefox Quantum")]], MHT files could be read and written by installing a [[Software:Browser extension|browser extension]], such as Mozilla Archive Format or UnMHT.
 
=== Safari ===
From version 3.1.1 onwards, [[Company:Apple Inc.|Apple Inc.]]'s [[Software:Safari (web browser)|Safari]] web browser does not natively support the MHTML format.  Instead, Safari supports the [[Webarchive|webarchive]] format, and the [[Software:MacOS|macOS]] version includes a print-to-[[PDF]] feature.
 
As with most other modern web browsers, support for MHTML files can be added to Safari via various third-party extensions.
 
=== Konqueror ===
As of version 3.5.7, [[Organization:KDE|KDE]]'s [[Software:Konqueror|Konqueror]] web browser does not support MHTML files. An extension project, mhtconv, can be used to allow saving and viewing of MHTML files.
 
===ACCESS NetFront===
[[Software:NetFront|NetFront]] 3.4 (on devices such as the Sony Ericsson K850) can view and save MHTML files.
 
===Pale Moon===
[[Software:Pale Moon (web browser)|Pale Moon]] requires an extension to be installed to read and write MHT files.  One extension is freely available, MozArchiver, a fork of Mozilla Archive Format extension.
 
=== GNOME Web ===
[[Software:GNOME Web|GNOME Web]] added support for read and save web pages in MHTML since version 3.14.1 released in September 2014.<ref>{{Cite web|url=https://gitlab.gnome.org/GNOME/epiphany/blob/master/NEWS#L1061|title = NEWS · master · GNOME / Epiphany| date=28 July 2023 }}</ref>
 
=== MHT viewers ===
There are commercial software products for viewing MHTML files and converting them to other formats, such as PDF and ePub. Some [[HTML editor]] programs can view and edit MHTML files<!--E.g. free WizHtmlEditor, among others, can edit MHTML-->.
 
==MIME type==
MIME type for MHTML is not well agreed upon. Used MIME types include:
MIME type for MHTML is not well agreed upon. Used MIME types include:
* multipart/related
* <code>multipart/related</code>
* application/x-mimearchive
* <code>application/x-mimearchive</code>
* message/rfc822
* <code>message/rfc822</code>


==Other apps==
== Supporting apps ==
=== Web browsers ===
Some browsers support the MHTML format, either directly or through third-party [[Software:Browser extension|extensions]], but the process for saving a web page along with its resources as an MHTML file is not standardized. Due to this, a web page saved as an MHTML file using one browser may render differently on another.


=== Problem Steps Recorder ===
* '''[[Software:Internet Explorer 5|Internet Explorer 5]]''' was the first browser to support reading and saving web pages and external resources to a single MHTML file.
[[Software:Problem Steps Recorder|Problem Steps Recorder]] for Windows can save its output to MHT format.
* '''[[Software:Microsoft Edge|Microsoft Edge]]''', after switching to the Chromium source code, supports saving web pages as MHTML.
* '''[[Software:Opera (web browser)|Opera]]:''' Support for saving web pages as MHTML files was made available in the Opera 9.0 web browser.<ref>{{cite web|url=http://my.opera.com/desktopteam/blog/show.dml/172375 |title=…and one more weekly! |last=Santambrogio |first=Claudio |date=10 March 2006 |publisher=Opera Software |access-date=2009-05-15 |archive-url=https://web.archive.org/web/20100115001636/http://my.opera.com/desktopteam/blog/show.dml/172375 |archive-date=15 January 2010 |url-status=dead }}</ref> From Opera 9.50 through the rest of the Presto-based Opera product line (version 12.16 as of 19 July 2013), the default format for saving pages is MHTML. The initial release of the new Webkit/Blink-based Opera (version 15) did not support MHTML, but subsequent releases (16 and later) do. MHTML can be enabled by typing "<code>opera://flags#save-page-as-mhtml</code>" at the address bar.
* '''[[Software:Google Chrome|Google Chrome]] 86''' and later can create MHTML files.
* '''[[Software:Yandex Browser|Yandex Browser]] 22.7.4.960''' (July 2022) and later can create MHTML (multipart/related) files.
* '''[[Software:Vivaldi (web browser)|Vivaldi]] 2.3 and later''' can create MHTML files.<ref>{{Cite web|url=https://vivaldi.com/fr/blog/auto-stacking-tabs/|title=Vivaldi Update {{!}} Auto-Stacking Tabs|last1=février 6|first1=Publié sur|last2=Tetzchner|first2=2019-Par Jon von|date=2019-02-06|website=Vivaldi|language=fr|access-date=2019-05-16}}</ref> It supports both reading and writing MHTML files by toggling the "<code>vivaldi://flags/#save-page-as-mhtml</code>" option.
* '''[[Software:Firefox|Firefox]]''' does not support MHTML.<ref>{{cite web|title=Bug 40873 - Save as rfc 2557 MHTML; complete webpage in one file|url=https://bugzilla.mozilla.org/show_bug.cgi?id=40873}}</ref> Until [[Software:Firefox version history#Firefox 52 through 59|version 57 ("Firefox Quantum")]], two [[Software:Browser extension|browser extension]], Mozilla Archive Format or UnMHT, could read and write MHTML files. These extensions are incompatible with version 57 and later.
* '''[[Software:Safari (web browser)|Safari]] 3.1.1 and later''' terminated native support for the MHTML format. Instead, the browser supports the Web Archive format. The [[Software:MacOS|macOS]] version includes a print-to-[[PDF]] feature. As with most other modern web browsers, support for MHTML files can be added to Safari via various third-party extensions.
* '''[[Software:Konqueror|Konqueror]] 3.5.7 and later''' terminated support for MHTML files. An extension project, <code>mhtconv</code>, can be used to allow saving and viewing of MHTML files.
* '''[[Software:NetFront|NetFront]] 3.4''' (on devices such as the Sony Ericsson K850) can view and save MHTML files.
* '''[[Software:Pale Moon (web browser)|Pale Moon]]''' requires an extension to be installed to read and write MHT files. One extension is freely available, MozArchiver, a fork of Mozilla Archive Format extension.<ref name="MozArchiver">{{cite web |title=Pale Moon Add-ons - MozArchiver |url=https://addons.palemoon.org/addon/mozarchiver/ |website=addons.palemoon.org |access-date=3 December 2025 |language=en}}</ref>
* '''[[Software:GNOME Web|GNOME Web]]''' added support for read and save web pages in MHTML since version 3.14.1 released in September 2014.<ref>{{Cite web|url=https://gitlab.gnome.org/GNOME/epiphany/blob/master/NEWS#L1061|title = NEWS · master · GNOME / Epiphany| date=28 July 2023 }}</ref>


=== Save to Google Drive extension ===
=== Other apps ===
The "Save to Google Drive" extension for [[Software:Google Chrome|Google Chrome]] can save as MHTML as one of its outputs.
* [[Software:Problem Steps Recorder|Problem Steps Recorder]] for Windows can save its output to MHT format.
 
* The "Save to Google Drive" extension for [[Software:Google Chrome|Google Chrome]] can save as MHTML as one of its outputs.
===Microsoft OneNote===
* [[Software:Microsoft OneNote|Microsoft OneNote]] 2010 and later can email individual pages as .mht files.
[[Software:Microsoft OneNote|Microsoft OneNote]], starting with OneNote 2010, emails individual pages as .mht files.
* [[Software:Evernote|Evernote]] for Windows can export notes as MHT format, as an alternative to HTML or its own native .enex format.
 
===Evernote===
[[Software:Evernote|Evernote]] for Windows can export notes as MHT format, as an alternative to HTML or its own native .enex format.


== Exploits ==
== Exploits ==
In May 2015, a researcher noted that attackers could build malicious documents by creating an MHT file, appending an MSO object at the end (MSO is a file format used by the [[Software:Microsoft Outlook|Microsoft Outlook]] e-mail application), and renaming the resulting file with a .doc extension.<ref>{{Cite web|url=https://www.securityweek.com/attackers-hide-malicious-macros-mhtml-documents|title=Attackers Hide Malicious Macros in MHTML Documents|last=Kovacs|first=Eduard|date=May 11, 2015|website=SecurityWeek.Com|access-date=April 19, 2019}}</ref> The delivery method would be by spam emails.<ref>{{Cite web|url=https://www.cyren.com/blog/articles/new-tricks-of-macro-malware|title=New Tricks of Macro Malware|last=Mosuela|first=Lordian|date=July 10, 2015|website=Cyren|language=en|access-date=April 19, 2019}}</ref>
In May 2015, a researcher noted that attackers could build malicious documents by creating an MHT file, appending an MSO object at the end (MSO is a file format used by the [[Software:Microsoft Outlook|Microsoft Outlook]] e-mail application), and renaming the resulting file with a .doc extension.<ref>{{Cite web|url=https://www.securityweek.com/attackers-hide-malicious-macros-mhtml-documents|title=Attackers Hide Malicious Macros in MHTML Documents|last=Kovacs|first=Eduard|date=May 11, 2015|website=SecurityWeek.Com|access-date=April 19, 2019}}</ref> The delivery method would be by spam emails.<ref>{{Cite web |last=Mosuela |first=Lordian |date=July 10, 2015 |title=New Tricks of Macro Malware |url=https://www.cyren.com/blog/articles/new-tricks-of-macro-malware |url-status=dead |archive-url=https://web.archive.org/web/20240908105634/https://data443.com/blog/cyren/new-tricks-of-macro-malware/ |archive-date=September 8, 2024 |access-date=April 19, 2019 |website=Cyren |language=en}}</ref>


In April 2019, a security researcher published details about an [[XML external entity attack|XML external entity (XXE) vulnerability]] that could be exploited when a user opens an MHT file. Since the Windows operating system is set to automatically open all MHT files, by default, in Internet Explorer, the exploit could be triggered when a user double-clicked on a file that they received via email, instant messaging, or another vector, including a different browser.<ref>{{Cite web|url=https://www.zdnet.com/article/internet-explorer-zero-day-lets-hackers-steal-files-from-windows-pcs/|title=Internet Explorer zero-day lets hackers steal files from Windows PCs|last=Cimpanu|first=Catalin|date=April 12, 2019|website=ZDNet|language=en|access-date=April 19, 2019}}</ref>
In April 2019, a security researcher published details about an [[XML external entity attack|XML external entity (XXE) vulnerability]] that could be exploited when a user opens an MHT file. Since the Windows operating system is set to automatically open all MHT files, by default, in Internet Explorer, the exploit could be triggered when a user double-clicked on a file that they received via email, instant messaging, or another vector, including a different browser.<ref>{{Cite web|url=https://www.zdnet.com/article/internet-explorer-zero-day-lets-hackers-steal-files-from-windows-pcs/|title=Internet Explorer zero-day lets hackers steal files from Windows PCs|last=Cimpanu|first=Catalin|date=April 12, 2019|website=[[ZDNet]]|language=en|access-date=April 19, 2019}}</ref>


== Alternatives ==
== Alternatives ==
=== data URI scheme ===
=== Data URI scheme ===
The [[Data URI scheme|data URI scheme]] offers an alternative for including separate elements such as images, style-sheets and scripts in-line when serving an HTML request or saving an HTML resource for offline use. Like the embedded content within MHTML, data URIs use [[Base64]] encoding of the external resources (which may be binary or text) to embed them in-line within the HTML markup. HTML pages saved with external elements embedded using the [[Data URI scheme|data URI scheme]] are standard web pages, and can be opened by any modern browser, including browsers not supporting MHTML such as Mozilla Firefox.<ref>{{cite web|url=https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Data_URLs#browser_compatibility|title=Data URLs - HTTP|website=MDN|language=en|access-date=April 2, 2023}}</ref> Unlike MHTML, saving web pages with their external resources embedded using data URIs requires a third-party extension to be installed in the browser.<ref>{{cite web|url=https://www.ghacks.net/2018/09/03/save-any-webpage-as-a-single-file-in-chrome-or-firefox/|title=Save any webpage as a single file in Chrome or Firefox - gHacks Tech News|last=Brinkmann|first=Martin|date=September 3, 2018|website=ghacks.net|language=en|access-date=April 2, 2023}}</ref>
The [[Data URI scheme|data URI scheme]] offers an alternative for including separate elements such as images, [[Style sheet (web development)|style-sheets]] and scripts in-line when serving an HTML request or saving an HTML resource for offline use. Like the embedded content within MHTML, data URIs use Base64 encoding of the external resources (which may be binary or text) to embed them in-line within the HTML markup. HTML pages saved with external elements embedded using the [[Data URI scheme|data URI scheme]] are standard web pages, and can be opened by any modern browser, including browsers not supporting MHTML such as Mozilla Firefox.<ref>{{cite web|url=https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Data_URLs#browser_compatibility|title=Data URLs - HTTP|website=MDN|language=en|access-date=April 2, 2023}}</ref> Unlike MHTML, saving web pages with their external resources embedded using data URIs requires a third-party extension to be installed in the browser.<ref>{{cite web|url=https://www.ghacks.net/2018/09/03/save-any-webpage-as-a-single-file-in-chrome-or-firefox/|title=Save any webpage as a single file in Chrome or Firefox - gHacks Tech News|last=Brinkmann|first=Martin|date=September 3, 2018|website=ghacks.net|language=en|access-date=April 2, 2023}}</ref>


=== Mozilla Archive Format ===
=== Mozilla Archive Format ===
Line 135: Line 92:


== See also ==
== See also ==
* [[Data URI scheme|data URI scheme]]
* [[Data URI scheme]]
* [[Mozilla Archive Format]]
* [[Mozilla Archive Format]]
* [[Mpack (Unix)]]
* Web Archive format
* [[Webarchive]]
* [[WARC (file format)|WARC format]]
* [[Web ARChive]]
* [[HTML|HTML format]]
* [[Media type|MIME type]]


== References ==
== References ==
{{Reflist|35em}}
{{Reflist}}


==External links==
==External links==
* [http://www.dsv.su.se/~jpalme/ietf/mhtml.html MHTML standard explained]
* [http://www.dsv.su.se/~jpalme/ietf/mhtml.html MHTML standard explained]
* RFC 2557 (1999)—MIME Encapsulation of Aggregate Documents, such as HTML (MHTML)
* [https://datatracker.ietf.org/doc/rfc2557/ RFC 2557] (1999)—MIME Encapsulation of Aggregate Documents, such as HTML (MHTML)
* RFC 2110 (1997, Obsolete)—MIME E-mail Encapsulation of Aggregate Documents, such as HTML (MHTML)
* [https://datatracker.ietf.org/doc/rfc2110/ RFC 2110] (1997, Obsolete)—MIME E-mail Encapsulation of Aggregate Documents, such as HTML (MHTML)
 


{{web browsers}}
{{web browsers}}
{{Internet Explorer}}
{{Microsoft Office}}
{{Microsoft Office}}


{{DEFAULTSORT:Mhtml}}
[[Category:Archive formats]]
[[Category:Archive formats]]
[[Category:HTML]]
[[Category:HTML]]


{{Sourceattribution|MHTML}}
{{Sourceattribution|MHTML}}

Latest revision as of 06:32, 14 April 2026

MHTML
Filename extension: .mht, .mhtml
Internet media type:
  • multipart/related
  • application/x-mimearchive
  • message/rfc822
Type of format: Markup language
Extended from: HTML
Standard: RFC 2557 (proposed 1999)

MHTML, an initialism of "MIME encapsulation of aggregate HTML documents", is a web archiving file format used to combine, in a single computer file, the HTML code and its companion resources (such as images) that are represented by external hyperlinks in the web page's HTML code.[1] The content of an MHTML file is encoded using the same techniques that were first developed for HTML email messages, using the MIME content type multipart/related.[2] MHTML files use an .mhtml or .mht filename extension.

The first part of the file is an e-mail header. The second part is normally HTML code. Subsequent parts are additional resources identified by their original uniform resource locators (URLs) and encoded in Base64 binary-to-text encoding. MHTML was proposed as an open standard, then circulated in a revised edition in 1999 as RFC 2557.
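
As a rough illustration of this layout, the following Python sketch assembles a multipart/related document with the standard library's email package. The function name build_mhtml, the example URLs, and the fixed subject line are assumptions made for the example; browsers that save MHTML write additional headers and may choose different transfer encodings.

```python
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_mhtml(page_url, html, images):
    """Return the bytes of a minimal MHTML-style document.

    `images` maps a resource URL to raw PNG bytes (a hypothetical input
    shape chosen for this sketch).
    """
    # First part of the file: an e-mail style header for the whole document.
    root = MIMEMultipart("related", type="text/html")
    root["Subject"] = "Example page"  # placeholder title

    # Second part: the HTML code, labelled with its original URL.
    page = MIMEText(html, "html", "utf-8")
    page["Content-Location"] = page_url
    root.attach(page)

    # Subsequent parts: resources, Base64-encoded by MIMEImage by default.
    for url, data in images.items():
        part = MIMEImage(data, _subtype="png")
        part["Content-Location"] = url
        root.attach(part)

    return root.as_bytes()
```

Writing the returned bytes to a file with an .mhtml extension gives a document with the structure described above; whether a particular browser renders it as intended is not guaranteed by this sketch.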

The .mhtml and .eml filename extensions are interchangeable: a file with either extension can be renamed to use the other. An .eml message can be sent by e-mail, and it can be displayed by an email client. An email message can be saved using a .mhtml or .mht filename extension and then opened for display in a web browser or for editing in other programs, including word processors and text editors.

Layout

The header of an MHTML file contains metadata such as a date and time stamp, page title, the source URL, and a unique randomized boundary string for separating resources contained within the file. The boundary string is defined at the beginning and used throughout the file.

From: <Saved by Blink>
Snapshot-Content-Location: https://en.wikipedia.org/wiki/Smartphone
Subject: Smartphone - Wikipedia
Date: Sat, 24 Sep 2022 00:34:32 -0000
MIME-Version: 1.0
Content-Type: multipart/related;
        type="text/html";
        boundary="----MultipartBoundary--GsIBda0vjy2AKIAIliwl7JMwezXDRjDAsLje9khd5l----"
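
Because this header uses the same syntax as an e-mail message, it can be read with a general-purpose MIME parser. The sketch below uses Python's standard email package on the header shown above; the function name read_mhtml_header and the keys of the returned dictionary are assumptions made for the example.

```python
from email import message_from_string

# Header text copied from the example above, terminated by a blank line.
SAMPLE_HEADER = (
    "From: <Saved by Blink>\n"
    "Snapshot-Content-Location: https://en.wikipedia.org/wiki/Smartphone\n"
    "Subject: Smartphone - Wikipedia\n"
    "Date: Sat, 24 Sep 2022 00:34:32 -0000\n"
    "MIME-Version: 1.0\n"
    "Content-Type: multipart/related;\n"
    "\ttype=\"text/html\";\n"
    "\tboundary=\"----MultipartBoundary--GsIBda0vjy2AKIAIliwl7JMwezXDRjDAsLje9khd5l----\"\n"
    "\n"
)


def read_mhtml_header(text):
    """Extract the metadata fields mentioned in the text from an MHTML header."""
    msg = message_from_string(text)
    return {
        "title": msg["Subject"],
        "source_url": msg["Snapshot-Content-Location"],
        "date": msg["Date"],
        "content_type": msg.get_content_type(),
        # get_boundary() returns the boundary parameter without its quotes.
        "boundary": msg.get_boundary(),
    }


info = read_mhtml_header(SAMPLE_HEADER)
print(info["title"], info["boundary"])
```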

The page resources then follow in sequence, starting with the page's rendered HTML source code. Each resource has its own metadata header, which specifies its MIME type and original location.

------MultipartBoundary--GsIBda0vjy2AKIAIliwl7JMwezXDRjDAsLje9khd5l----
Content-Type: text/html
Content-ID: <frame-D968CEC8BB7E60A1859261A8CA5DFB4D@mhtml.blink>
Content-Transfer-Encoding: binary
Content-Location: https://en.wikipedia.org/wiki/Smartphone

<!DOCTYPE html>

The MHTML file ends with a boundary string that is not followed by any data.[3]
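
The sequence of boundary-separated parts can be walked with the same MIME parser. The following Python sketch builds a small two-part sample and lists each resource's MIME type, original location, and decoded size. The boundary string, URLs, and the names build_sample and list_resources are hypothetical; the sample also assumes the usual MIME convention that each delimiter line is the boundary prefixed with two hyphens and that the closing delimiter carries two more hyphens at the end.

```python
import base64
from email import message_from_bytes

BOUNDARY = "----ExampleBoundary----"  # hypothetical boundary string


def build_sample():
    """Return a tiny MHTML-style document with one HTML part and one image part."""
    payload = base64.b64encode(b"not-a-real-image").decode("ascii")
    lines = [
        "From: <Saved by Example>",
        "Subject: Example page",
        "MIME-Version: 1.0",
        'Content-Type: multipart/related; type="text/html"; boundary="%s"' % BOUNDARY,
        "",
        "--" + BOUNDARY,
        "Content-Type: text/html",
        "Content-Transfer-Encoding: binary",
        "Content-Location: https://example.org/",
        "",
        '<!DOCTYPE html><img src="https://example.org/a.png">',
        "--" + BOUNDARY,
        "Content-Type: image/png",
        "Content-Transfer-Encoding: base64",
        "Content-Location: https://example.org/a.png",
        "",
        payload,
        "--" + BOUNDARY + "--",  # final boundary, followed by no data
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def list_resources(raw):
    """Return (MIME type, original location, decoded size) for each resource."""
    msg = message_from_bytes(raw)
    resources = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # skip the multipart/related container itself
        body = part.get_payload(decode=True)  # undoes Base64 where used
        resources.append((part.get_content_type(), part["Content-Location"], len(body)))
    return resources


for resource in list_resources(build_sample()):
    print(resource)
```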

MIME type

There is no single agreed-upon MIME type for MHTML. MIME types in use include:

  • multipart/related
  • application/x-mimearchive
  • message/rfc822

Supporting apps

Web browsers

Some browsers support the MHTML format, either directly or through third-party extensions, but the process for saving a web page along with its resources as an MHTML file is not standardized. Due to this, a web page saved as an MHTML file using one browser may render differently on another.

  • Internet Explorer 5 was the first browser to support reading and saving web pages and external resources to a single MHTML file.
  • Microsoft Edge, after switching to the Chromium source code, supports saving web pages as MHTML.
  • Opera: Saving web pages as MHTML files became available in Opera 9.0.[4] From Opera 9.50 through the rest of the Presto-based Opera product line (version 12.16 as of 19 July 2013), MHTML was the default format for saving pages. The initial release of the new WebKit/Blink-based Opera (version 15) did not support MHTML, but subsequent releases (16 and later) do. MHTML can be enabled by entering "opera://flags#save-page-as-mhtml" in the address bar.
  • Google Chrome 86 and later can create MHTML files.
  • Yandex Browser 22.7.4.960 (July 2022) and later can create MHTML (multipart/related) files.
  • Vivaldi 2.3 and later can create MHTML files.[5] Reading and writing MHTML files is enabled by toggling the "vivaldi://flags/#save-page-as-mhtml" option.
  • Firefox does not support MHTML.[6] Until version 57 ("Firefox Quantum"), two browser extensions, Mozilla Archive Format and UnMHT, could read and write MHTML files. These extensions are incompatible with version 57 and later.
  • Safari, from version 3.1.1 onwards, does not natively support the MHTML format. Instead, the browser supports the Web Archive format, and the macOS version includes a print-to-PDF feature. As with most other modern web browsers, support for MHTML files can be added to Safari via various third-party extensions.
  • Konqueror, as of version 3.5.7, does not support MHTML files. An extension project, mhtconv, can be used to allow saving and viewing of MHTML files.
  • NetFront 3.4 (on devices such as the Sony Ericsson K850) can view and save MHTML files.
  • Pale Moon requires an extension to be installed to read and write MHT files. One freely available extension is MozArchiver, a fork of the Mozilla Archive Format extension.[7]
  • GNOME Web has supported reading and saving web pages in MHTML since version 3.14.1, released in September 2014.[8]

Other apps

  • Problem Steps Recorder for Windows can save its output in MHT format.
  • The "Save to Google Drive" extension for Google Chrome offers MHTML as one of its output formats.
  • Microsoft OneNote 2010 and later can email individual pages as .mht files.
  • Evernote for Windows can export notes in MHT format, as an alternative to HTML or its own native .enex format.

Exploits

In May 2015, a researcher noted that attackers could build malicious documents by creating an MHT file, appending an MSO object at the end (MSO is a file format used by the Microsoft Outlook e-mail application), and renaming the resulting file with a .doc extension.[9] Such documents would be delivered through spam emails.[10]

In April 2019, a security researcher published details about an XML external entity (XXE) vulnerability that could be exploited when a user opens an MHT file. Because Windows opens all MHT files in Internet Explorer by default, the exploit could be triggered when a user double-clicked a file received via email, instant messaging, or another vector, including a different browser.[11]

Alternatives

Data URI scheme

The data URI scheme offers an alternative for including separate elements such as images, style-sheets and scripts in-line when serving an HTML request or saving an HTML resource for offline use. Like the embedded content within MHTML, data URIs use Base64 encoding of the external resources (which may be binary or text) to embed them in-line within the HTML markup. HTML pages saved with external elements embedded using the data URI scheme are standard web pages, and can be opened by any modern browser, including browsers not supporting MHTML such as Mozilla Firefox.[12] Unlike MHTML, saving web pages with their external resources embedded using data URIs requires a third-party extension to be installed in the browser.[13]
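
A minimal Python sketch of this embedding is given below. The helper names to_data_uri and inline_resource are assumptions for the example, and the plain string replacement stands in for the HTML parsing that a real page-saving tool would need.

```python
import base64


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Encode raw resource bytes as a Base64 data URI."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def inline_resource(html: str, url: str, data: bytes, mime_type: str) -> str:
    """Replace every occurrence of an external URL with an in-line data URI."""
    return html.replace(url, to_data_uri(data, mime_type))


page = '<img src="https://example.org/a.png">'
print(inline_resource(page, "https://example.org/a.png", b"hi", "image/png"))
```

The result is an ordinary HTML document with no external references for the embedded resource, which is why it opens in browsers that lack MHTML support.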

Mozilla Archive Format

The Mozilla Archive Format (MAFF) is a legacy Web archive file format that was supported by Firefox from 2004 to 2018 through an add-on.[14] Unlike both MHTML and data URIs, MAFF uses a ZIP container to preserve both the HTML file and its external elements. In October 2017 the add-on developer announced the format would no longer be supported in future versions of Firefox.[15]

See also

  • Data URI scheme
  • Mozilla Archive Format
  • Web Archive format
  • WARC format
  • HTML format
  • MIME type

References

  1. "Key Concepts of MHTML". 9 November 2009. https://learn.microsoft.com/nb-no/previous-versions/office/developer/exchange-server-2010/aa563378(v=exchg.140). 
  2. Holden, Amanda. "Difference of HTML & MHTML". https://www.techwalla.com/articles/difference-of-html-mhtml. 
  3. "2. The MHTML File Format - Hunchly Knowledge Base". October 17, 2018. https://support.hunch.ly/article/51-1-the-mhtml-file-format. 
  4. Santambrogio, Claudio (10 March 2006). "…and one more weekly!". Opera Software. http://my.opera.com/desktopteam/blog/show.dml/172375. 
  5. février 6, Publié sur; Tetzchner, 2019-Par Jon von (2019-02-06). "Vivaldi Update | Auto-Stacking Tabs" (in fr). https://vivaldi.com/fr/blog/auto-stacking-tabs/. 
  6. "Bug 40873 - Save as rfc 2557 MHTML; complete webpage in one file". https://bugzilla.mozilla.org/show_bug.cgi?id=40873. 
  7. "Pale Moon Add-ons - MozArchiver" (in en). https://addons.palemoon.org/addon/mozarchiver/. 
  8. "NEWS · master · GNOME / Epiphany". 28 July 2023. https://gitlab.gnome.org/GNOME/epiphany/blob/master/NEWS#L1061. 
  9. Kovacs, Eduard (May 11, 2015). "Attackers Hide Malicious Macros in MHTML Documents". https://www.securityweek.com/attackers-hide-malicious-macros-mhtml-documents. 
  10. Mosuela, Lordian (July 10, 2015). "New Tricks of Macro Malware" (in en). https://www.cyren.com/blog/articles/new-tricks-of-macro-malware. 
  11. Cimpanu, Catalin (April 12, 2019). "Internet Explorer zero-day lets hackers steal files from Windows PCs" (in en). https://www.zdnet.com/article/internet-explorer-zero-day-lets-hackers-steal-files-from-windows-pcs/. 
  12. "Data URLs - HTTP" (in en). https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Data_URLs#browser_compatibility. 
  13. Brinkmann, Martin (September 3, 2018). "Save any webpage as a single file in Chrome or Firefox - gHacks Tech News" (in en). https://www.ghacks.net/2018/09/03/save-any-webpage-as-a-single-file-in-chrome-or-firefox/. 
  14. "Mozilla Archive Format Add-on - File Format Overview" (in en). https://www.amadzone.org/mozilla-archive-format/. 
  15. "Firefox Addon: MAF - Mozilla Archive Format". https://addons.mozilla.org/en-US/firefox/addon/mozilla-archive-format/. Retrieved 2 April 2023. 

External links

  • MHTML standard explained
  • RFC 2557 (1999)—MIME Encapsulation of Aggregate Documents, such as HTML (MHTML)
  • RFC 2110 (1997, Obsolete)—MIME E-mail Encapsulation of Aggregate Documents, such as HTML (MHTML)