HTML includes

One of the first questions beginners ask when starting to learn HTML is how to do includes. They seldom know that includes are what they are asking about; they just feel bad about having to copy and paste the same menu HTML every time they want a new page. “Do I have to type the same thing over and over?”

After asking friends how to solve the problem they get the answer that they now have to learn PHP. Or ASP. Or JSP. Or some other strange language they don’t need. And install this thing here, and that thing there. What does your server host support? Oh, no, you need to configure stuff better. No, that setting is insecure… You know the drill, I’m sure you’ve walked someone through it sometime.

So, a way to include a piece of HTML into a random page would clearly benefit beginners learning HTML. But that’s not all:

  • Browsers would be able to cache components of a page, and therefore load new pages that use common components faster.
  • Less work learning new template languages just to find that language X does not have a way to include things.
  • Possible to learn componentization by looking at existing sites and learning from them. This is an area we need to get better at.
  • Easy to make fallbacks by linking directly to the corresponding HTML snippet.
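
To put a rough number on the caching point, here is a minimal sketch that counts requests for a made-up three-page site where every page shares a menu and a footer. The page names, the fragment names, and the assumption that the browser caches fragments but re-requests each page are all hypothetical; real browser caching depends on the headers the server sends.

```javascript
// Hypothetical site: every page pulls in the same two shared fragments.
const pages = {
  "index.html": ["menu.html", "footer.html"],
  "about.html": ["menu.html", "footer.html"],
  "contact.html": ["menu.html", "footer.html"],
};

// Count network requests for visiting the given pages in order.
// With useCache = true, a fragment is fetched only the first time it is seen.
function countRequests(visited, useCache) {
  const cache = new Set();
  let requests = 0;
  for (const page of visited) {
    requests += 1; // the page itself is always requested
    for (const fragment of pages[page]) {
      if (useCache && cache.has(fragment)) continue; // served from cache
      requests += 1;
      cache.add(fragment);
    }
  }
  return requests;
}

const visit = ["index.html", "about.html", "contact.html"];
console.log(countRequests(visit, false)); // 9
console.log(countRequests(visit, true)); // 5
```

The first page costs more requests than a single server-assembled page would, but every later page only costs one request plus whatever fragments are new.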

So, how would this be implemented? We need a tag that acts as a kind of include, so what about <object>? Just point to the HTML file you want and voilà, it gets included. Since this is HTML it would work exactly the same across all server-side languages.
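
As a rough illustration of the behaviour I have in mind (not of how any browser works), here is a sketch that swaps each <object> tag for the markup it points to. The function name expandIncludes, the in-memory fragment store, and the file name menu.html are made up for the example, and the regular expression only handles the simplest form of the tag.

```javascript
// Hypothetical fragment store standing in for files on the server.
const fragments = {
  "menu.html": '<ul><li><a href="/">Home</a></li></ul>',
};

// Replace every <object data="..."></object> that points at a known
// fragment with that fragment's markup. Unknown targets are left alone.
// Simplification: only matches an empty <object> with a single data attribute.
function expandIncludes(html, store) {
  const pattern = /<object\s+data="([^"]+)"\s*><\/object>/g;
  return html.replace(pattern, (tag, url) =>
    Object.prototype.hasOwnProperty.call(store, url) ? store[url] : tag
  );
}

const page = '<body><object data="menu.html"></object><h1>About</h1></body>';
console.log(expandIncludes(page, fragments));
// <body><ul><li><a href="/">Home</a></li></ul><h1>About</h1></body>
```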

Luckily, this is already in the HTML5 spec, at the bottom of the object specification. Interestingly, I found it after writing this article… Great minds think alike :)

In this example, an HTML page is embedded in another using the object element.

My HTML Clock

Good! And it even seems to work in current browsers (thanks, Siegfried). I’ve tested it in Firefox, Opera, and Safari, and it works the same in all of them. Internet Explorer 6, 7, and 8 (beta 2) just ignore it altogether.

The problem is that current implementations handle them just like iframes. And there are of course lots of problems with that approach:

  • Currently, an object element without a height and width gets rendered as a 300 x 150 pixel block. There is no reason whatsoever to do this when including HTML. This must change for this to be usable.
  • The included HTML needs to be stylable with the CSS rules on the page it’s included from. Currently this does not work, because included HTML is treated like an iframe. This must change if this is to be usable.
  • Are HTML components full HTML pages? Do they include a doctype and a <head>, and do those get included? I assume only HTML inside <body> gets included in the new page. No CSS. No JS linked to in <head>.
  • Clicking links inside included HTML should be handled as if the HTML was on the current page. This follows the same concept as if the HTML was included on the server side, and is needed if this is ever going to be used for a menu.
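
The third point, that only what is inside <body> should end up in the including page, can be sketched like this. The function name extractBody and the sample component are hypothetical, and a regular expression is a simplification that a real implementation would replace with a proper HTML parser.

```javascript
// Return only what sits between <body> and </body> of a full HTML page.
// If there is no <body> element, the input is treated as a bare fragment.
function extractBody(html) {
  const match = /<body[^>]*>([\s\S]*?)<\/body>/i.exec(html);
  return match ? match[1].trim() : html.trim();
}

// Hypothetical component written as a complete page, head and all.
const component = `<!DOCTYPE html>
<html>
<head><title>Menu</title><link rel="stylesheet" href="menu.css"></head>
<body>
  <ul><li><a href="/about.html">About</a></li></ul>
</body>
</html>`;

console.log(extractBody(component));
// <ul><li><a href="/about.html">About</a></li></ul>
```

The doctype, the title, and the stylesheet link are dropped, so the menu would be styled only by the CSS of the page that includes it.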

So, what do you think, is this a good idea? Personally, I’m hoping more concepts from template languages start to move into HTML.

Update: Thanks to the comments (thanks zcorpan) I have now found exactly the above in HTML5. It’s called <iframe seamless>. It meets all the points in my list, and I’m now really looking forward to the first implementation.

22 responses to “HTML includes”

  1. Well, object is not new at all. It has worked this way at least since html 4. I’m not quite sure about html 3. Although the IE… But that’s always the same.

    There is 1 thing to mention when “including” html via object. The included page is a full html page. It may contain links. When you click one of those links, only the embedded page changes. This may be useful, but in other circumstances it is not what you want.

    Generally, including html is not _that_ good an idea. It is a security risk. You know including html via iframe? You know that mainly Russian criminals include invisible trojans mainly via iframe? Including html in some other way opens more risks.

    There is a common mechanism for including html snippets (f.ex. a navigation menu) on the server side, called SSI (server side includes). No need for php or the like. The page is still delivered as a full html page. This feature was especially meant for including repetitive parts like navigation menus, footers and the like.

  2. @Siegfried: I had no idea it currently worked in Firefox, Opera and Safari (just tested). Unfortunately, it’s currently unusable for several reasons. I’ll update the post with more info about my findings.

    As you say linking is a problem area that needs to be changed for this to be usable. Added it to the list at the bottom.

    It’s not a security risk: you can include things either on the server or on the client, and it’s equally secure. Including Russian spyware HTML is stupid, I agree.

    SSI is a language just like PHP and the others. It’s not better in any way.

  3. An object is a separate object; it may be an image, a flash video or a java applet. Or an html page, or anything else your browser can handle. But it is a separate object: the contents of the object are not part of the DOM tree that contains the object.

    I think changing this for the special case of including html in html would be unfortunate.

    It is, of course, possible to do such inclusion with javascript, but I still think baking ready-made pages on the server side is better. The learning curve of plain old server side includes is very flat, and it can be enabled in practically every server there is.

  4. @Rasmus Kaj: Would it be better if it’s a separate tag? Not object? In the spec linked, there already is a section that spells out the special rules for each type of data, the HTML one is empty.

    For beginners, learning includes is often all they need. You can build a perfectly fine site with just that little tool. Problem is, it’s different in every language, and often a hassle to get working (if you’re not a hacker like most of us).

    I believe the advantages outweigh the fact that the model gets a little bit different for HTML includes.

  5. Hi,

    I’m currently (for development and testing) using a vrml object. This is a full 3D description format. If there is no vrml plugin available, I need a fallback.

    One possible fallback would be a series of screenshots, navigable by mouse clicks. So the fallback is not a simple image, but an html file containing that image and an image map. Since the effect should be somewhat similar to the original vrml, clicking on a part of the image should only change that image. This is exactly what including the html file via object does.

    So at least for me, this object behaviour is useful. I’d not like to change it.

    O.k., SSI is a language like any other. But SSI can only do 1 thing: including. So it is far easier to learn than f.ex. php.

    For xhtml there may be more options. You can f.ex. use xsl. And xsl allows for including. What’s more, xsl allows for building the page on the client side. In this case you don’t even need xhtml, just xml+xsl. And xml itself offers mechanisms for including (http://xml.silmaril.ie/authors/includes/).

    So, although I think html 5 is a useful intermediate step, the next real step is still xhtml (and maybe simple xml). It offers many benefits.

  6. Ah, just 1 more thing: You know that building a web page out of independent blocks would slow down page loading? And thus lower page performance? Each independent block would need a separate http request. With caching this may then result in faster loading, but that depends on caching. And I know many (too many) sites that disallow caching.

    So building a page out of parts would be two-fold. It may be an advantage but, if not well designed, may be a disadvantage.

    For the development process, on the other hand, it would indeed be an advantage, no doubt.

  7. @Siegfried: Concerning the fallback: You could use an object tag, with an iframe inside it that works exactly like you said it would. Frames do contain the clicks inside them.

    Yes, I could use XSL I guess, I just don’t like that language. Again, it’s a new language people have to learn… And although this one works on the client side, it’s still not enough for beginners I think. Good idea though, you really know your way around web technologies.

    About performance: Yes, this would need more http-requests initially, and fewer for subsequent pages. If developers feel this is too much of a performance hit, they can keep on using server-side includes instead.

  8. It would be awesome! It would be a killer argument for not using frames and iframes instead of learning template languages.

    No decent web developer writes HTML without includes or variables anyway, so to make it a part of the spec only seems reasonable to me.

  9. You should check out the <iframe seamless> feature and its friends in HTML5.

  10. @zcorpan: Thanks, iframe seamless does exactly what I’m looking for. Hoping to see the first implementation soon! I’ve updated the post.

  11. I’m surprised it doesn’t work in IE – I used to use [object] elements in my website (although not wrapped in a [figure] element), and that certainly worked in IE5 and IE6. It didn’t look pretty, but it worked – up to a point.

    I stopped using [object] when I got my head round SSI, which is a much more effective way of achieving the end result. OK, so it doesn’t allow caching, but everything else works better – it looks neater, fewer HTTP requests, fewer bytes if the user only looks at one page, easier to maintain, and solves the problem of search engines calling up the object data page as a stand-alone.

  12. One more issue that you have to think about before using this is SEO.
    A search engine’s crawler needs to “know” that it needs to look into the HTML include file and count the text in it as part of the original page and not as a different one. Otherwise your website’s pages might not be indexed correctly (e.g. if you have all of your main menu links inside such an include).
    Search Engines (well, Google…) must first learn to deal with these includes before I will start using them. Until then it’s PHP includes only.

  13. @Alon Peer: Agreed, search engines really need to understand this before people will start to use it widely.

    But even without that it could be immensely useful for prototyping things, or why not an intranet? Things that won’t necessarily need to be crawled.

  14. @Stevie D: You don’t happen to know exactly what you did to make it work in IE? I’d love to see an example.

    “Everything else works better”: I don’t agree with that (especially not neatness), and I’ve given my arguments “pro seamless” in the article.

  15. What I used was
    [p][object data="sourcefile.htm" type="text/html" height="..." width="..."][a href="sourcefile.htm"]Link to source file[/a] for browsers that don't support object element[/object][/p].

    It definitely worked in IE5.5 and IE6, although it looked ugly as heck.

    I’m surprised you and Alon are talking about how it messes up Google. I had always understood that the page was served whole – the browsing agent wouldn’t be aware that some of the code was sent by SSI include. Google would just see the final page, as your browser would, with the included code, erm, included with all the rest of the code.

    Why do I think it works better to use SSI?

    * Neatness – because SSI drops a page fragment into the document flow, you can style it seamlessly with the rest of the page. Object elements have to have the size specified, which often results in scrollbars and ugly borders.

    * Capability – SSI allows you to insert any document fragment, it doesn’t have to be a stand-alone object or set of elements, it could end mid-element if that serves your needs better! Object elements don’t give you that flexibility.

    * Search resilient – because the included fragment is only ever served as part of the parent page, search engines treat it as though it was part of that page. With object elements, you have the risk that spiders won’t read or index pages/fragments linked through the object element, won’t treat them as part of the main page (for ranking and context purposes), and will return them as results in their own right, which means that surfers can be directed to a fragment page that probably doesn’t answer their question and may not work as a self-contained page.

    * Accessibility – because SSI is all done server-side, all the user agent sees is the completed page, so if your coding is up to scratch, it’s accessible straight away. Object elements are not supported by all user agents – some old browsers and, I suspect, mobile phones and assistive technologies, don’t read object elements, so you have to include alternative content and remember to change it as needed, which is a headache to maintain, and adds to the bytes transferred. There is also the worry that, even if the user agent does render the object elements, features such as tabbing to links and fields may not work as intended or in the correct order.

    From reading the [iframe seamless] bit on HTML5, it sounds like they are just reinventing what we’ve already got with SSI, but relying on user agents to treat the included content in the manner specified (do we really want to do that?), whereas with SSI it is guaranteed. The only advantage I can see to using any form of [object] or [iframe] element is to allow caching of the included file, but is that really all that important these days?

  16. Yes an object would allow you to include pages within HTML pages, however this has serious implications for search engines. Content within objects, frames and iframes might look right in a browser window, but search engines will not be able to view and parse your content.

    Using PHP or ASP only requires one additional line of code to include another page such as a header or footer. You don’t have to master an entire programming language at all.

  17. @Rob: Search engines would have to update their algorithms, just like browsers would. Nothing says that Google can’t make two requests to a website instead of one. (Minor note: please don’t use your company name as the name, that’s spamming).

  18. @Mikael Lundin: I’ve never seen that one, which is probably one of the explanations for why it isn’t supported: people don’t know about it. I’m hoping for the most probable solution that will do what I listed in the article, and to me, that’s HTML5.

  19. The difference is that XInclude would need a pre-parse before the DOM is loaded, whereas an iframe is an element in the DOM. XInclude is transparent to JavaScript. IFrame is not.

    XInclude is not intended to be used in HTML, but rather in XML and I actually found it useful when dealing with XML on the server side, and one time including data into a NAnt script :)

    If the browser supports XHTML I can’t see why it should not also support XInclude, since it comes from the same family. There seems to be some support in IE7, and I guess that’s because of the XSL support in that browser.

    Well, that’s my two cents anyway.
