MIND

This article assumes you're familiar with HTML, C/C++

Dynamic HTML in Internet Explorer 4.0
John P. Grieb

Internet Explorer 4.0 features Dynamic HTML, the next step in Web designability. With Dynamic HTML, you can position items, improve scripting services, and more!
Fasten your seat belts and get ready for Trident—the latest breakthrough in Internet technology to hit the World Wide Web. Trident is the codename for Microsoft's Dynamic HTML technology that will be delivered to users with the release of Microsoft® Internet Explorer 4.0 (IE 4.0). Imagine Web pages that dynamically sort a table of data the way a user chooses, insert a Table of Contents with the click of a button, and allow users to drag images and objects. All this and a whole lot more is possible with Dynamic HTML.

In the Beginning There was HTML…

Until now, users had little control over the style and content of Web pages once they were rendered. The static nature of HTML has limited the quality of user interaction that Internet developers could provide.
HTML was originally developed as a document format to exchange information over the Internet. It provided a platform-independent method for describing a document in a bandwidth-efficient manner. It was also relatively easy to interpret. As the focus of the Web shifted primarily from the scientific and educational communities to the commercial world, HTML was extended to provide richer formatting capabilities. Web pages became more artistic and compelling (largely through the creative use of images), but the underlying style and content of the pages remained fixed.
The advent of scripting and the ability to insert ActiveX controls and applets gave Web pages their first breath of life. For the first time, users had the ability to interact with the Web. However, this level of interaction was a far cry from what people were used to with traditional applications and games. Until now, client-side scripting has been unable to directly access and change a Web page's HTML, although it could completely overwrite it. User interactions generally required communication with the Web server, which resulted in poor responsiveness.

HTML—The Next Generation

Microsoft has now done for HTML what it did so well (with COM) for the world of PC software: they objectized it. By expanding the current HTML Object Model to include HTML elements themselves, Microsoft has exposed the content and style of Web pages to programmatic access and control. HTML tags, style sheets, text, tables, ActiveX objects, and applets can be dynamically modified without interaction with the Web server. By giving client-side scripting the capability to control the content and style of a Web page, responses to user interaction can be handled on the client's PC, resulting in the almost instantaneous feedback that users have come to expect from software applications.
Data manipulation, formatting, and content changes are no longer solely dependent on Web servers. By capitalizing on the distributed computing model, server resources and bandwidth are conserved. The expanded object model has also enhanced the formatting capabilities of HTML. The location of objects and images can now be precisely specified. Overlapping and the control of an element's z-order are also possible. The movement of images can be achieved without the overhead of an animated GIF or other type of graphic.
There is more to Dynamic HTML than just client-side scripting. ActiveX controls can also access the HTML Object Model, paving the way for third-party custom controls that perform predefined functions on the contents of a Web page (this will be demonstrated in both this article and "Writing Internet Explorer 4.0 Controls with Visual Basic 5.0," by Joshua Trupin, in this issue).

Web Browsing the Object-Oriented Way

Before diving into the HTML Object Model, it's important to understand the role it plays. Although Microsoft Internet Explorer appears to be a single monolithic program, it is actually made up of a group of components. The executable IEXPLORE.EXE is a small program that simply serves as a container, providing a frame to Microsoft's WebBrowser object (SHDOCVW.DLL). This object implements the basic browser features such as forward and back navigation, favorites, page refresh, and printing.
The WebBrowser object itself does not have the ability to display an HTML page or any other type of document. It allows users to view and manipulate documents by acting as an ActiveX document container, loading the ActiveX document server that knows how to process the current URL. By taking this approach, Microsoft has made its browser document-independent—it can be used to view Word documents and Microsoft Excel spreadsheets as well as HTML pages. Moreover, documents can be viewed within their own environments; not only will Internet Explorer display a Word document, it provides the standard Word user interface as well.
Figure 1: HTML Viewer

Figure 2: Executing Scripts

In the case of Web pages, the WebBrowser object loads Microsoft's HTML Viewer (MSHTML.DLL). This ActiveX document server provides the functionality needed to interpret, display, and provide programmatic access to HTML pages (see Figure 1). This entree to the HTML pages is achieved through client-side scripting.
While the HTML Viewer knows HTML, it doesn't know how to execute code. When a <SCRIPT> tag is detected within a Web page, the HTML Viewer enlists the services of a scripting engine. The appropriate engine is loaded based upon the script's language (VBSCRIPT.DLL for Visual Basic® Script or JSCRIPT.DLL for JScript and JavaScript). Once loaded, the HTML Viewer passes the code contained within the <SCRIPT> element to the scripting engine, where it is executed (see Figure 2).
Client-side scripting was developed as a method of accessing the properties and methods of objects and acting upon their events. As the host of the scripting engine, the HTML Viewer is responsible for exposing the objects that the script operates on. In addition to exposing a Web page's ActiveX controls, the HTML Viewer maintains and exposes a set of objects that allow access to elements of the browser and the content of the current Web page. Collectively, these objects are referred to as the HTML Object Model. The HTML Viewer exposes this object model as a set of COM interfaces. ActiveX controls within the Web page can also access the object model through these interfaces.

The Dynamic HTML Object Model

The top two tiers of the Dynamic HTML Object Model, illustrated in Figure 3, consist of seven types of objects and one collection. The foundation of the object model, the window object, represents the browser's window. All other objects inherit from, and are accessed through, this one object.

Figure 3: The Dynamic HTML Object Model

Additional window objects are used to represent child windows, more commonly referred to as frames. These window objects are accessed through the frames collection. The properties, methods, and events of the browser window and frame windows are exposed through window objects. (Note that the object tables presented throughout this article will use bold type to highlight events, properties, methods, and collections that are not part of the current HTML Object Model. Since this information is based on a preview release, names and functionality are subject to change.)
Window Object
Events onfocus, onload, onunload, onblur, onerror, onhelp, onbeforeunload
Methods alert, confirm, prompt, open, close, redraw, setTimeout, clearTimeout, navigate, blur, eval, focus, item, scroll, doPaint, showHelp, showModalDialog, recalc, getMember
Properties name, parent, opener, self, top, defaultStatus, status, history, location, navigator, document, event, visual, fullscreen, length, dialogArgs, dialogReturn, closed, menubar, toolbar, directory, scrollbar
Collections frames
The history object exposes methods that allow navigation through the browser's history list (previously visited URLs). It supports properties that provide access to the current URL and the length of the history list.
History Object
Methods back, forward, go, getMember, setMember
Properties length, current
The location object provides a method that causes the window's current URL to be reloaded and properties that allow the current URL to be parsed and modified.
Location Object
Methods reload, getMember, setMember
Properties href, protocol, host, hostname, port, pathname, search, hash
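The way these properties split a URL can be sketched outside the browser. The following is a minimal illustration, not IE 4.0 code: it assumes a JavaScript runtime that provides the standard URL class, which happens to use the same property names, and the address is a made-up example.

```javascript
// Sketch only: the standard URL class stands in for window.location,
// which exists only inside a browser. The address is hypothetical.
const loc = new URL("http://www.example.com:8080/docs/index.htm?q=dhtml#top");

// The same property names that the location object exposes.
const parts = {
  href: loc.href,
  protocol: loc.protocol,
  host: loc.host,
  hostname: loc.hostname,
  port: loc.port,
  pathname: loc.pathname,
  search: loc.search,
  hash: loc.hash
};

console.log(parts);
```

Note that protocol keeps its trailing colon, and search and hash keep their leading "?" and "#".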
The navigator object exposes properties and collections that provide information about the browser. It also supports a method that allows a new URL to be loaded, replacing the current URL in the history list.
Navigator Object
Methods replace, getMember, setMember
Properties appCodeName, appName, appVersion, userAgent, javaEnabled, taintEnabled, cookieEnabled
Collections mimeTypes, plugins
The screen object provides information about the client's screen and its rendering abilities.
Screen Object
Methods getMember, setMember
Properties bufferDepth, colorDepth, hres, vres
The event object exposes the properties associated with the event that is currently being processed (such as a mouse click or key press). It also provides the ability to cancel the event's default behavior and to prevent additional event handlers from being executed.
Event Object
Methods getMember
Properties keyCode, fromElement, toElement, button, zoomPercent, cancelBubble, srcElement, x, y, shiftKey, ctrlKey, altKey, returnValue

The Dynamic HTML Document Object

One of the primary differences between the Dynamic HTML Object Model and the model currently supported by IE 3.0 is in the definition of the HTML document object (see Figure 4). The document object represents the Web page currently displayed within the window. It exposes the contents of the page through a series of properties, methods, events, and collections.
Document Object
Events onclick, ondblclick, onkeydown, onkeypress, onkeyup, onload, onmousedown, onmousemove, onmouseout, onmouseover, onmouseup, onmousewheel, onreadyStateChange, onresize, onselectionchange, onhelp, onzoom, onenter, onexit, onbeforeupdate, onafterupdate
Methods write, writeLn, open, close, clear, save, new, rangeFromText, elementFromPoint, execCommand, rangeFromElement, rangeFromPoint, queryCommandState, queryCommandEnabled, queryCommandText, queryCommandSupported, execCommandShowHelp, queryCommandIndeterm, createElement, getMember, setMember
Properties linkColor, aLinkColor, vLinkColor, bgColor, fgColor, location, lastModified, title, cookie, referrer, body, domain, readyState, saved, scrollbars, keepScrollbarsVisible, zoom, URL, link, vlink, aLink
Collections anchors, links, forms, all, applets, frames, images, scripts, styleRules
Element objects are used to represent the HTML tags contained within the Web page. Every element object exposes a common set of properties and methods. In addition, each object exposes additional properties, methods, and events that are unique to the type of tag it represents.

Element Object
Methods scrollIntoView, contains, removeMember, getMember, setMember
Properties id, name, parentElement, tagname, Class, top, left, sourceIndex, style, form, document
The document object exposes these element objects through a set of collections as illustrated in Figure 4. A collection is an array of element objects that are ordered based on the relative positions of their tags within the HTML source. Collections expose a read-only length property that returns the number of element objects in the collection. They also expose two methods.
Figure 4: The Dynamic HTML Object Model

The item method provides access to the individual element objects within the collection. It takes a single parameter, which can be the element's name, id, or position:
 scripts.item(3).property 

images.item("logo").method
This method is the collection's default method and can be invoked implicitly:
 scripts(3).property

images("logo").method
The tags method returns a collection of element objects which represent the specified type of tag:
 IFrameTags = all.tags("IFRAME")

AreaTags = links.tags("AREA")
The all collection consists of every tag contained within the Web page whether or not the HTML Viewer recognizes them. The anchors, frames, forms, images, links, scripts, applets, and styleRules collections consist of subsets of the all collection—each one containing the element objects that correspond to its specific set of HTML tags (see Figure 4).
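The behavior of item and tags can be modeled with a small stand-alone class. This is only a sketch under stated assumptions: ElementCollection, its sample elements, and the zero-based positions are my own illustration, not the objects that the HTML Viewer actually exposes.

```javascript
// Hypothetical model of a Dynamic HTML collection; not the real IE 4.0 objects.
class ElementCollection {
  constructor(elements) {
    this.elements = elements; // kept in HTML source order
  }
  // Read-only count of the element objects in the collection.
  get length() {
    return this.elements.length;
  }
  // item accepts a position (assumed zero-based here) or a name/id string.
  item(key) {
    if (typeof key === "number") {
      return this.elements[key] || null;
    }
    return this.elements.find(e => e.id === key || e.name === key) || null;
  }
  // tags returns a new collection holding only the requested type of tag.
  tags(tagName) {
    const wanted = tagName.toUpperCase();
    return new ElementCollection(
      this.elements.filter(e => e.tagName.toUpperCase() === wanted)
    );
  }
}

const all = new ElementCollection([
  { tagName: "IMG", id: "logo" },
  { tagName: "IFRAME", id: "news" },
  { tagName: "IMG", id: "banner" },
  { tagName: "IFRAME", id: "ads" }
]);

const iframeTags = all.tags("IFRAME");
console.log(all.length, all.item("logo").tagName, all.item(2).id, iframeTags.length);
```

Because tags returns another collection, the result can be indexed with item in exactly the same way as the collection it came from.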
The forms collection is composed of <FORM> tag element objects. Each of these objects exposes an elements collection that consists of the <INPUT>, <SELECT>, <OBJECT>, <APPLET>, <TEXTAREA>, and <IMG> element objects contained within the form.
The body object provides access to the textual content of the Web page through the createTextRange method. It also supports the align method as well as the methods and properties common to all element objects.
Body Object
Methods createTextRange, align, scrollIntoView, contains, removeMember, getMember, setMember
Properties id, name, parentElement, tagname, Class, top, left, sourceIndex, style, form, document
The Selection object represents the range of the Web page that is currently selected by the user.
Selection Object
Methods clear, createRange, empty
Properties type
The HTML Object Model represents tables using a hierarchy of objects as illustrated in Figure 5. Each element object that represents a <TABLE> tag exposes a collection of row objects that represent the table's <TR> tags. Each row object, in turn, exposes a collection of cell objects that represent the table's <TH> and <TD> tags.
Figure 5: Tables in the Dynamic HTML Object Model

TextRange Objects

Programmatic manipulation of the content of Web pages is accomplished through the use of TextRange objects. These objects represent the stream of text that is associated with a particular tag. The <BODY>, <DIV>, <MARQUEE>, <TD>, and <TH> element objects expose a createTextRange method, which creates and returns a TextRange object.

TextRange Object
Methods queryCommandSupported, duplicate, inRange, isEqual, collapse, expand, move, select, moveEnd, moveStart, parentElement, isEmbed, queryCommandIndeterm, setRange, scrollIntoView, commonParentElement, execCommand, queryCommandEnabled, queryCommandText, queryCommandState, pasteHTML, execCommandShowHelp, getMember, setMember
Properties text, htmlText, end, start, htmlSelText
The object's htmlText property allows the HTML source within the range to be read, while its pasteHTML method allows HTML source to be inserted into the range. The text property allows non-HTML text to be read from and written to the range. As an example, the line of code
 MyText.pasteHTML("<P>This is a test</P>")
would generate the HTML "<P>This is a test</P>" and display "This is a test" in a browser. Whereas the line of code
 MyText.text = "<P>This is a test</P>"
would generate the HTML "&lt;P&gt;This is a test&lt;/P&gt;" and display "<P>This is a test</P>" in a browser.
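The difference between the two comes down to whether the markup characters are escaped. Here is a minimal sketch of that escaping; escapeHtml is a hypothetical helper written for this illustration and is not part of the Dynamic HTML Object Model.

```javascript
// Sketch of the escaping implied by the text property. escapeHtml is a
// hypothetical helper, not part of the Dynamic HTML Object Model.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;") // must run first so the entities added below survive
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

const source = "<P>This is a test</P>";

// pasteHTML: the string is inserted as markup, unchanged.
const viaPasteHTML = source;

// text: the string is inserted as literal characters, so markup is escaped.
const viaText = escapeHtml(source);

console.log(viaPasteHTML);
console.log(viaText);
```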
TextRange objects expose a group of move methods that allow the text to be walked through by character, word, or sentence. They also expose a group of find methods that provide the ability to search for specific text or the text associated with a specific element or tag. Modifying the start and end properties or invoking the expand or collapse methods allows the amount of text included in a range to be changed. The execCommand method provides the ability to change the attributes of the range's text (such as bolding or font size) without requiring knowledge of HTML.
The selection object, which is exposed by the document object, supports a createRange method that creates a TextRange object from a copy of the user's current selection.

Event Model

The Dynamic HTML Object Model incorporates an event model that allows events to bubble up through the document's element hierarchy. That is, an event is processed first by the target element and then by each of its parent elements until it is finally processed by the document object itself. All elements fire the following standard keyboard and mouse events:


 onkeydown    onmousedown    onmouseover    ondblclick
 onkeypress   onmousemove    onmouseout     onclick
 onkeyup      onmouseup        
In addition, elements that are capable of receiving focus also fire the onenter, onexit, onfocus, and onblur events.
There are other events, specific to individual types of elements, that are not included in the event model. These are excluded mainly because their uniqueness does not lend itself to the "generic script processing" which the bubbling event model provides.
The model can best be illustrated by example. The HTML page shown in Figure 6 displays two images and contains script that processes the images' click events at different levels of the document. Clicking on either image will bring up a series of three message boxes indicating the id of the image that was clicked on and the level at which the event was processed. The event model lets you write a single function that can process the events of all elements or a group of elements based on their ids or names (such as window.event.srcElement.id or window.event.srcElement.name). The srcElement element object exposes all of the properties and methods of the element that it represents.
The following code snippets illustrate other methods that can be used to associate an event handler with an element:
 <SCRIPT language=VBScript>
      ' Bound by name: the sub is named after the element's id and the event
      sub picture1_onclick()
           msgbox("Element was processed at the document level")
      end sub
 </SCRIPT>
 <IMG id=picture1 src="Image1.gif">
 
 <SCRIPT language=VBScript>
      ' Bound inline: the IMG tag's onclick attribute calls the sub
      sub EventHandler()
           msgbox("Element was processed at the document level")
      end sub
 </SCRIPT>
 <IMG onclick="EventHandler()" src="Image1.gif">
A key part of the event model is the introduction of the event object, which is exposed by the window object. This global object is used to furnish information about the event that is being processed. As an example, if the onkeypress event were being processed, the altKey, ctrlKey, and shiftKey properties could be read to determine if the Alt, Control, or Shift keys were also pressed.
In addition to furnishing information, the event object provides two important functions. First, by setting the event object's cancelBubble property to true, processing of the event will be prevented from bubbling any further upward. In other words, the event handlers of subsequent parent elements will not be executed. If the script from the original event model example were modified as shown in Figure 7, then clicking on the first image would still result in a series of three message boxes. However, clicking on the second image would only cause two message boxes to be displayed. The <TD> element's click event handler cancels the bubbling of the event if the image picture2 is clicked on.
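The effect of cancelBubble can be modeled without a browser. The sketch below is my own reconstruction of the behavior just described, not the script from Figure 7: makeNode and fireClick are hypothetical stand-ins, and the image, <TD>, and document levels are represented by plain objects linked through a parent property.

```javascript
// Hypothetical model of event bubbling; the node objects and fireClick
// are illustrative stand-ins, not IE 4.0 APIs.
function makeNode(id, parent) {
  return { id, parent, onclick: null };
}

function fireClick(target, log) {
  const event = { srcElement: target, cancelBubble: false };
  // Start at the target and walk up through its parents.
  for (let node = target; node !== null; node = node.parent) {
    if (node.onclick) {
      node.onclick(event, log);
    }
    if (event.cancelBubble) {
      break; // parents further up never see the event
    }
  }
  return log;
}

const doc = makeNode("document", null);
const td = makeNode("cell", doc);
const picture1 = makeNode("picture1", td);
const picture2 = makeNode("picture2", td);

const imageHandler = (e, log) => log.push(e.srcElement.id + " at image level");
picture1.onclick = imageHandler;
picture2.onclick = imageHandler;
td.onclick = (e, log) => {
  log.push(e.srcElement.id + " at TD level");
  if (e.srcElement.id === "picture2") {
    e.cancelBubble = true; // stop before the document handler runs
  }
};
doc.onclick = (e, log) => log.push(e.srcElement.id + " at document level");

console.log(fireClick(picture1, [])); // three entries
console.log(fireClick(picture2, [])); // two entries
```

Each log entry corresponds to one message box: a click on picture1 reaches all three levels, while a click on picture2 stops at the <TD> level.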
The second function that is provided by the event object is the ability to cancel the event's default action. Every event is associated with a default action that is dependent upon the type of element that the event was fired over. In the case of clicking on an anchor, the browser would jump to the anchor's link (represented by its href property). Setting the event object's returnValue property to false disables an event's default action. The following code would require a user to double click on the anchor in order to jump to its link:
 <html>
 <body>
 <SCRIPT language=VBScript for="anchor" event="onclick()">
      window.event.returnValue = false
 </SCRIPT>
 <SCRIPT language=VBScript for="anchor" event="ondblclick()">
      window.location.href = window.event.srcElement.href
 </SCRIPT>
 <A id=anchor href="http://www.microsoft.com"> Hyper-link</A>
 </body>
 </html>
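The interplay between a handler and the default action can also be sketched as a plain function. Everything here is an assumption made for illustration: dispatch, the handlers table, and the visited list are hypothetical, and only a click is given a default action.

```javascript
// Hypothetical dispatcher: runs a handler, then the default action unless
// the handler set returnValue to false. Names are illustrative only.
function dispatch(element, eventName, visited) {
  const event = { srcElement: element, returnValue: true };
  const handler = element.handlers[eventName];
  if (handler) {
    handler(event, visited);
  }
  // Default action for a click on an anchor: follow its href.
  if (eventName === "onclick" && event.returnValue !== false) {
    visited.push(element.href);
  }
  return visited;
}

const anchor = {
  href: "http://www.microsoft.com",
  handlers: {
    // Suppress the jump on a single click.
    onclick: (e) => { e.returnValue = false; },
    // Perform the jump explicitly on a double click.
    ondblclick: (e, visited) => { visited.push(e.srcElement.href); }
  }
};

console.log(dispatch(anchor, "onclick", []));    // nothing visited
console.log(dispatch(anchor, "ondblclick", [])); // the anchor's href
```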

Applications of Dynamic HTML

Now that I've covered the basics of the Dynamic HTML Object Model, it's time to tie it all together with some examples.
Image Dragging: The first example demonstrates the ability to drag images across a Web page. The <DIV> tag is used to define a container that is 600 pixels wide by 400 pixels tall. The body of the document is itself a container; it is only necessary to define a subcontainer if you wish to restrict the layout to a portion of the page. The style POSITION:ABSOLUTE places an element at explicit coordinates measured from its container (the page itself, when the element is not nested inside another positioned element). If the style POSITION:RELATIVE were used, the coordinates would instead be treated as offsets from the element's normal position in the flow of the document.
There are two images within the container, each of which has an absolute width, height, and z-index. The z-index is used to define the image's layer. An image with a higher z-index will appear in front of an image with a lower z-index.
Images are moved by first capturing the mousedown event that occurs when the user presses the mouse button. When this occurs, the mouse position and the id of the image under the mouse are recorded. As each mousemove event occurs, the change in the mouse's coordinates is used to adjust the image's left and top position.
Code has also been included to prevent the images from being dragged out of the container. The left position is always kept less than or equal to the width of the container minus the width of the image. The top position is always kept less than or equal to the height of the container minus the height of the image.
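The boundary check amounts to clamping the new position. The following is a rough sketch rather than the script from Figure 8: the 600 by 400 container comes from the description above, while dragImage, the image dimensions, and the lower bound of zero are my own assumptions.

```javascript
// Hypothetical drag helper. The container size comes from the article
// (600 x 400); the image size and the lower bound of zero are assumptions.
const container = { width: 600, height: 400 };

function clamp(value, min, max) {
  return Math.min(Math.max(value, min), max);
}

// Returns the image's new position after the mouse moves by (dx, dy).
function dragImage(image, dx, dy) {
  return {
    ...image,
    left: clamp(image.left + dx, 0, container.width - image.width),
    top: clamp(image.top + dy, 0, container.height - image.height)
  };
}

const image = { id: "picture1", left: 450, top: 50, width: 100, height: 80 };
const moved = dragImage(image, 200, -90);
console.log(moved.left, moved.top); // 500 0
```

In this run the requested position (650, -40) lies outside the container, so the image stops at the right and top edges.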
In addition, if the mouse moves outside of the container's boundaries, the current image id is cleared. This causes future mousemove events to be ignored. If a portion of an image were to fall outside the boundaries of the container, it would be clipped (that portion of the image would become invisible).
In Figure 8, the user is prevented from dragging the second image. This is accomplished by trapping and canceling the image's mousedown event, which prevents the generic mousedown event handler from receiving and processing the event.
Search & Replace: The second example demonstrates the ability to find and replace text within a Web page (see Figure 9). Edit controls allow the user to enter the search and replacement strings. A button is used to trigger the search and replace function.
The first step is to create a TextRange object containing all of the <BODY>'s text. Then the search string is looked for within the range. If found, the string is stored in another TextRange object. This subrange is then selected and the user queried as to whether he wants the text to be replaced.
If the user clicks Yes, the text of the subrange is replaced with the replacement string. The start position of the primary TextRange object is then moved past the subrange and the subrange text is deselected. Finally, another occurrence of the search string is looked for within the primary TextRange object. Not finding another occurrence or clicking on Cancel will cause the while loop and the script to be terminated.
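The control flow of that loop can be shown on an ordinary string. This is a simplified sketch, not the script from Figure 9: it uses string indexes in place of TextRange objects, and confirmReplace is a hypothetical callback standing in for the prompt. The "no" answer, which skips an occurrence, is my own assumption.

```javascript
// Sketch of the loop on a plain string. confirmReplace is a hypothetical
// callback standing in for the user prompt: it returns "yes", "no", or "cancel".
function searchAndReplace(text, find, replacement, confirmReplace) {
  if (find === "") {
    return text; // nothing to search for
  }
  let start = 0; // start of the range still to be searched
  while (true) {
    const found = text.indexOf(find, start);
    if (found === -1) {
      break; // no further occurrence
    }
    const answer = confirmReplace(found);
    if (answer === "cancel") {
      break;
    }
    if (answer === "yes") {
      text = text.slice(0, found) + replacement + text.slice(found + find.length);
      start = found + replacement.length; // move past the replaced text
    } else {
      start = found + find.length; // skip this occurrence
    }
  }
  return text;
}

const answers = ["yes", "no", "yes"];
const result = searchAndReplace("cat, cat, cat", "cat", "dog", () => answers.shift());
console.log(result); // dog, cat, dog
```

Moving the start position past the replacement, rather than past the original match, keeps the loop from finding the search string again inside text it has just inserted.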
Search & Replace ActiveX Control: For the final example, I have implemented the search and replace functionality within an ActiveX control using Visual C++® 4.2b along with include and library files from the Internet SDK (see Figure 10). The first step was to use the AppWizard to create the base control. I then used the ClassWizard to add FindString and ReplaceString properties that I stored internally in CStrings. Finally, I used the ClassWizard to create an Execute method and inserted code that performs the search and replace function.
Implementing the Execute method is straightforward. The client site's GetContainer function is used to retrieve a pointer to the interface that represents the Web page. This, in turn, is used to obtain a pointer to the document object's interface. From this, I obtain a pointer to the body object's interface, which is used to create a TextRange object containing all of the <BODY>'s text.
The remainder of the code follows the script from the previous example very closely. The put_text function calls require their string arguments to be of type BSTR; this is accomplished by using the CString's AllocSysString method. Before returning from the method, all of the allocated interface pointers must be released. Finally, in an effort to save space I did not check the return values from each of the function calls. This is not good programming practice; each OLE function returns an HRESULT, which should always be checked for error codes.
As you can see from the Web page's HTML (see Figure 11), very little scripting is required. When the user clicks the button, the FindString and ReplaceString properties are set and the Execute method is invoked. This is a good example of an off-the-shelf control that a Web developer could use to implement search and replace functionality without needing to understand how to program Dynamic HTML.

The Rest is Up to You

Dynamic HTML will open up a new era for the World Wide Web. Dramatically increased user interaction, powerful control over the content and style of pages, and significantly reduced bandwidth will create opportunities for Web developers that previously have only been possible with software applications. The new and exciting possibilities are limited only by your imagination.

From the July 1997 issue of Microsoft Interactive Developer.

© 1997 Microsoft Corporation. All rights reserved. Legal Notices.