XHTML Web Design for Beginners

An introduction to designing web pages using XHTML aimed at beginners.

Technical Level: Basic/Beginner
Author: Nigel Peck

This article continues at part two.

Introduction

This article is for readers who have either no prior experience of Web Design or very little. If you have dabbled with exporting HTML from Microsoft Word, or played around with FrontPage a little and want to understand what you are doing then this article is for you. I will teach you what XHTML is and how you can use it to start producing the next generation of Web pages.

If you have difficulty with any part of this article or can't get an example to work feel free to contact MIS Web Design. I'll do my best to answer you as quickly as possible.

If you want to skip this introduction and get on with it feel free. Just go to the Hello World section and get started. But please come back and read the rest of this introduction later when you have time.

Colour

I have used colour in the example XHTML throughout this article to make it easier for you to understand the code. The colour is purely there for this reason and serves no other purpose.

No Programs

I will not be showing you how to use any programs to write XHTML for you. I have a firm belief that the best way to write Web pages is to get your hands dirty and write the code yourself. I've been doing it for seven years so far and it hasn't let me down yet. Here are the main reasons I believe this.

Programs that produce HTML for you often do so badly. What I mean is that they often produce Web pages that go the long way round about doing things. When you code your pages by hand you have an intimate understanding of what you are doing and can make the actual size of the Web page file as small as possible. This reduces download times so your pages load quicker and your users are happier.

When you use a program to generate HTML for you, you do not understand how your page is built internally because it does it for you. This is not a problem as long as everything works. But what about when it doesn't? If you find that your Web page doesn't display properly in Internet Explorer 4, and many of your users use that browser, you are going to have to sort it out. This means forgetting about the program and looking at the code yourself. Do you see the problem? You've been using the program to code the page for you so when the problem occurs you haven't got the knowledge you need to fix it. And problems will occur.

The Internet is no longer limited to people with computers viewing Web sites through one or two different Web browsers. Everything has a Web browser in it these days: mobile phones, televisions, personal digital assistants, cars, even fridges. Blind users "view" Web sites using speech synthesis or Braille devices. There is no way you can test each page you produce in all of the possible ways it may be used. But there is a way to give your pages the best chance of working. This is achieved by producing pages that follow the standards laid out by the World Wide Web Consortium (W3C), the people who work on XHTML and other Internet standards. Once you have produced your pages, the W3C provide a validation service to check that each page meets the standards and therefore has the best chance of being usable on any device. I do not know of any HTML generation programs that produce valid code.

I hope that has persuaded you that the learning curve for XHTML is worth it. If you decide to use a program to do it then that will have a learning curve too, so you might as well take the code option and save yourself hassle in the future.

Why XHTML?

Since 1990 HTML or Hyper Text Markup Language has been the language recommended for writing Web pages in. And it has been very successful (you didn't need me to tell you that). But HTML has its problems. Without going into specifics, as it's not the subject of this article, HTML has become a mess. To sort this mess out the World Wide Web Consortium, the standards body for the Web, came up with XHTML in 1999. XHTML stands for eXtensible Hyper Text Markup Language and is written in a language called XML or eXtensible Markup Language.

As the name implies XHTML has the capability of being extended. You can use extra modules to do things with your pages that weren't possible with HTML. The long-term goal is that your Web pages will be able to be understood by computers as well as humans. If this doesn't make sense, allow me to explain.

You may be thinking that computers already understand Web pages because you use a computer to view them. This is true. But computers only understand how to display your pages, not what they mean. Imagine if computers understood what they meant. You could tell your computer to go and visit all of your local supermarkets' Web sites and tell you which one is the cheapest for this week's shopping. Your computer could visit the news sites around the world and bring back the latest headlines that relate to things you are interested in. The possibilities are endless.

Hopefully you now see why XHTML is important. I decided to write this tutorial to teach you XHTML from scratch. The main reason for this is that I couldn't find a beginner's XHTML tutorial anywhere. There are plenty of HTML beginner's articles, and plenty of XHTML introductions for those who can already do HTML, but it seems logical to me that if you are starting to learn Web Design now then you might as well use XHTML from the word go. So if you're still with me, go.


Hello World

No beginner's guide would be complete without showing you how to say "Hello World". With XHTML this is pretty simple. Don't worry if you don't understand everything; it will all become clear in time. Your "Hello World" Web page code looks like this:

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xml:lang="en" lang="en" xmlns="http://www.w3.org/1999/xhtml">
	<head>
		<title>Hello World</title>
	</head>
	<body>
		<p>My first Web page.</p>
	</body>
</html>

In a visual browser such as Internet Explorer the page above would look something like this:

A Microsoft Internet Explorer window. The title bar contains the text "Hello World". The page contains the text "My first Web page.".
Figure 1-1
We are not going to worry about the code that is a grey colour (the first two lines and the attributes inside the <html> tag) for the time being. All you need to know for now is that it tells the computer that this page is in XHTML and the language used is English. This code needs to be in every page that you produce and release on a live Web site, but I'm going to leave it out until we deal with it later, to help you learn without it getting in the way. Don't sweat it.

XHTML is called a Markup language because that's what you do with it. You mark up areas of text to indicate what they mean so the browser can know what to do with them. This is done by using elements. An element consists of two tags, an opening tag and a closing tag. Tags use the angle brackets < and > to show they are tags, and the closing tag also has a slash /.

Let's look at the document we just saw to demonstrate this. The <title> element is used to indicate the title of a page. In Internet Explorer this is displayed in the bar at the top of the window. Our title element looks like this:

<title>Hello World</title>

The <title> tag means we are starting a new title element. This is then followed by the text that we want the title to be. In this case the title will be "Hello World". To tell the browser that we have finished with the title we use a closing tag of </title>. As mentioned above, the only difference between a start tag and an end tag is the slash /. This is essential as it is the only way the computer knows whether you are starting a new element or finishing a previous one.

The name of the opening and closing tags must be the same, so:

<title>Hello World</heading>

is invalid and will not work.

As well as containing text such as "Hello World" above, elements can contain other elements. If we look just outside the <title> element we can see that it is inside a <head> element like so:

<head>
	<title>Hello World</title>
</head>

This means that the <title> is part of the <head> of the document, because it is inside it. There is no limit to how many elements another element can contain, as long as you follow the rules that we will look at later.

The <head> of a document is used to tell the computer things about your document rather than things that should be in it. The <title> is not part of the page itself; it describes what the document is, so it goes in the <head>. All XHTML documents must have a <head> element that must contain one <title> element, although others are allowed that we will look at later.

After the <head> comes the <body>. The <body> is the part of the document that contains the page itself. All XHTML documents must have one <body> element. The body contains things like paragraphs, bulleted lists, pictures and links to other documents. All of the stuff you view when you visit a site is contained in the <body> element.

Our <body> element is very simple; it contains a single element <p>:

<body>
	<p>My first Web page.</p>
</body>

Have you guessed what the <p> element is used for? The <p> element is used to mark a paragraph, so our page will have one paragraph with the text "My first Web page." in it. If we wanted to add another paragraph we could do it like this:

<body>
	<p>My first Web page.</p>
	<p>I hope you like it.</p>
</body>

In a visual browser such as Internet Explorer the page above would look something like this:

The page section only of a Web browser window. The text "My first Web page", followed by some space, and below it the text "I hope you like it.".
Figure 1-2
There's one more essential ingredient that we haven't covered. The <head> and <body> elements are contained by an element <html> which contains the entire document (the <head> and <body> elements). Our <html> element above looks like this:

<html>
	<head>
		<title>Hello World</title>
	</head>
	<body>
		<p>My first Web page.</p>
	</body>
</html>

The <html> element must contain one <head> element and one <body> element.

You may be wondering why there is extra space at the start of some of the lines. This is purely for our benefit and makes no difference to the computer processing your pages. The idea is to add tabs or a set amount of spaces at the start of each line to match the level of your tags. Look at the code above: <html> is not contained in any element so there is no space. <head> is contained by one element, <html>, so it has one tab. <title> is contained by two elements, <html> and <head>, so it has two tabs, and so on. Trust me, when your documents get big, it makes life a lot easier.

Now it's your turn

If you feel up to it, have a go at doing some pages yourself before reading any more. If not just skip to the next section.

First of all try the "Hello world" example that we just looked at. Here's how.

Open up a text editor of your choice. If you're using Windows then

Start > Programs > Accessories > Notepad

will get you into Notepad, but any text editor will do. Please note that Microsoft Word and other word processors are not text editors and are not suitable for this task.

Now type in the code below. I recommend that you type the code in yourself rather than copy and paste as it will help you to understand what you are doing. The tab key (for the spacing) is usually located above "Caps Lock" on the left of your keyboard.

<html>
	<head>
		<title>Hello World</title>
	</head>
	<body>
		<p>My first Web page.</p>
	</body>
</html>

Once you have typed the code into your text editor you will need to save it as a Web page file. Web page files have their own "extension" (the period and the letters after the file name) to distinguish them from other files such as Microsoft Word (.doc) or Adobe Acrobat (.pdf).

Web pages use an extension of either .htm or .html. I prefer to use .html as it matches the name of the language. The choice is yours. Some old systems will not save files with four letter extensions so .htm may be your only choice.

Once you have saved the file, open it up in your Web browser. On Windows this can usually be done by double-clicking the file in Windows Explorer. If you have typed it in correctly then you will see something similar to Figure 1-1 above.

Now that you have your page, try adding some more paragraphs to it like this:

<html>
	<head>
		<title>Hello World</title>
	</head>
	<body>
		<p>My first Web page.</p>
		<p>A second paragraph.</p>
		<p>Yet another paragraph.</p>
	</body>
</html>

Save your document again and refresh your Web browser. You should see the extra paragraphs appear after the first one.

Summary

That's it for your "Hello World" page. As I said when we started, don't worry if you didn't take it all in; we're going to be looking at each area in greater detail. Hopefully that has given you an idea of how Web pages work. In the next section, we're going to take a closer look at elements and tags and how they are used to build your documents.


XHTML Building Blocks

Elements and tags are the building blocks of XHTML. You need to fully understand both of these concepts to be able to write Web pages properly. We already touched on how they work in our example above but we're going to take a closer look now.

An element is used to mark sections of your document in order to tell the computer what that section is. This can range from marking the entire document as with the <html> element to marking a single word as important. The concept is the same in all cases.

Elements

Elements are made up of two tags; a start tag and an end tag. Between these tags is the element content.

<title>Hello World</title>

This element tells the computer that its content "Hello World" is the title for the document. Without the start and end tags the computer would have no way of knowing what to do with this text.

Start Tags

A start tag is made up of a left angle bracket followed by the name of the element and then a right angle bracket.

<title>

A start tag tells the computer that we are starting a new element and that it should regard everything it now encounters as part of that element's start tag until it reaches the right angle bracket.

End Tags

End tags are made up of a left angle bracket and a slash followed by the name of the element and then a right angle bracket.

</title>

Once the computer gets to the end tag for an element it knows that element is finished. The slash is necessary to distinguish it from the start tag.

Case Sensitivity

When you are entering your tags you must make sure that the names use lower case letters only. XHTML is what we call case-sensitive. This means that the following tags are all different:

  1. <title>
  2. <Title>
  3. <tITLE>

Only number 1 is an XHTML tag; the rest do not exist. All tags in XHTML are in lower case, so it is not difficult to remember. Just be careful and make sure you get it right.
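
This is a change from older HTML, which accepts tag names in any mix of upper and lower case. As a small illustration, only the first of the two paragraphs below is written correctly for XHTML:

<p>My first Web page.</p>
<P>My first Web page.</P>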

Empty Elements

Certain elements do not have any content. For these empty elements a special syntax is provided. Instead of inserting an end tag immediately after the start tag has finished, all we have to do is put a slash before the right angle bracket of the start tag to tell the computer that this element is finished.

The <br> element is used to insert a line break into your document. This tells the computer to stop the text at that point and start a new line. As you may have guessed the <br> element does not have any content so instead of entering the element like this:

<br></br>

we use a single tag with a slash at the end of the tag to show that it is an empty element:

<br />

Not only does this save typing, it also makes the code easier to read and more manageable. The space before the slash is necessary to support older Web browsers that do not understand empty elements and will simply ignore the slash as long as there is a space before it.
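
As a small illustration of this syntax, the paragraph below uses <br /> to put each line of a short rhyme on its own line. The rhyme is only placeholder text for the example.

<p>
	Roses are red,<br />
	Violets are blue.
</p>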

Content

The element we have just looked at only contained the text "Hello World". But elements can contain a lot more than just text. If they couldn't then XHTML wouldn't be very useful.

Other than text, most of your elements will also contain other elements. In fact a number of elements must contain certain other elements to work properly. We will look at each of these later.

An element that contains another element looks like this.

<head>
	<title>The document title</title>
</head>

Here we have a <head> element that contains a <title> element. As we go on you will see elements containing more and more elements as you build up your knowledge and produce larger, more complex documents.

Nesting

No, we're not talking about preparing for babies. Nesting means the way in which elements contain elements. When we say that elements are properly nested we mean that each element is completely contained within the elements that contain it, and it completely contains the elements it contains. Try and say that after a night out.

That might sound confusing, but it's really quite simple, as these examples will demonstrate. We are going to be using the elements <em> and <strong> which give text emphasis and strong emphasis respectively. We'll look at them in detail later.

<em>The Lord Of The Rings is a <strong>fantastic</strong> story.</em>

This is valid XHTML.

<em>The Lord Of The Rings is a <strong>fantastic</em> story. </strong>

This is not. The <em> starts outside the <strong> but finishes inside it. The tags are not properly nested. Think of elements as being like boxes. A box can have a box inside it, or can be inside a box, but it can't be inside a box and outside it as well. Neither can your elements.

Required Elements

There are four elements that all XHTML documents must contain. We have already seen that you must have a <head> and it must contain a <title>. I've also mentioned the <html> and <body> elements. We're going to look at each of these elements in turn, starting from the top.

<html>

The <html> element is the container for your whole document. It starts first and finishes last. It tells the computer that this is an <html> document and must always be present.

<head>

After <html> the next element should always be <head>. The head contains elements that are about the document rather than elements that are displayed in the page itself. This includes things like the document title, information to be given to search engines and how this document relates to others on your site.
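
To give a rough idea of what that can look like, here is a sketch of a <head> that carries a description for search engines and a pointer to a following page, alongside the title. The <meta> and <link> elements are not covered in this part of the article, and the description text and the file name page2.html are placeholders invented for the example.

<head>
	<title>Hello World</title>
	<meta name="description" content="My first Web page." />
	<link rel="next" href="page2.html" />
</head>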

<title>

Within the <head> of your document you must have a <title> which describes what the document is. Without a <title> your document is not valid.

<body>

Finally your document must have a <body>. The <body> is the Web page itself. It comes after the <head> and is the only other element that can go in your <html> element. Anything that you want to put in your page goes in here.

You can think of an XHTML document as being like a human body. All people are people from head to toe (<html>), they have a head that contains information you don't see when you look at them (<head>), they have a name (<title>) and they have a body (<body>).

Putting them all together

When we put all of these together we get the basic structure for an XHTML document. Here it is.

<html>
	<head>
		<title></title>
	</head>
	<body>
	</body>
</html>

Every XHTML document you produce will have that same basic structure. All other elements go in either the <head> or the <body>.

Attributes

Often an element can't convey enough information about itself through its name alone. For example, the <img> element, which is used to display an image, is no use on its own. You also need to tell the browser where to find the image file, and other things like a text description for users who don't get the image for one reason or another.

This is achieved with attributes. Attributes are added to the start tag of your element and come in the form of a name="value" pair. The name is the name of the attribute you are using; value is replaced with the value you wish to provide for the attribute. Let's take a closer look.

name="value"

As with element names, all attribute names are in lower case. You have a choice of using either double quotes " or single quotes ' as long as you use the same before and after the value. You must enclose the value in one form of quotes or the other. Without them your document will not be valid and may not work as intended.

Let's look at an example to see an attribute in action. Below is a simple <img> element that tells the browser to fetch an image from /images/logo.gif.

<img src="/images/logo.gif" />
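
The text description mentioned earlier is supplied with a second attribute, alt. As a sketch, the same element with both attributes might look like the first line below; the description "Company logo" is just a placeholder. The second line is the same element written with single quotes, which is equally acceptable.

<img src="/images/logo.gif" alt="Company logo" />
<img src='/images/logo.gif' alt='Company logo' />

The XHTML 1.0 rules require an alt attribute on every <img>, so the W3C validator will report an error if it is left out.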

You will see attributes used a lot and you'll soon get the hang of them so again, don't sweat it.

Summary

We have seen that there are rules to be followed when writing your XHTML documents, and we've looked at the basic building blocks of XHTML. As long as you follow these rules, plus others that I will mention as we go along, you are on your way to creating XHTML web pages. We're now going to add some elements to your arsenal that are used to mark up text.


Text That Says Something

Congratulations! What for? For getting to here, you've got past the hardest part. Whether you understood everything you read so far or just absorbed as much as you could, the next few sections should be a lot easier going as we look at the different elements in your XHTML arsenal and the meaning that they have.

We're going to start with giving more meaning to your text. This includes emphasis, citations, abbreviations and acronyms, quotes, computer text and document changes.

Marking Paragraphs with <p>

Before we dive into those, let's take another look at the paragraph element <p>. The <p> element is used to contain your paragraphs. It is what we call a block or box element. This means that when it occurs in your document (in a visual browser) it will start on a new line, and when it finishes the next element will start below it. This is best described by the example below.

Take a look at the code below which you have already seen in our first example.

<body>
	<p>My first Web page.</p>
	<p>I hope you like it.</p>
</body>

Here we have two paragraphs. Let's take another look at the way in which they would be displayed to understand what the <p> element is doing. I've added three blue bars to the picture to highlight the spacing and the new line that has been created from using the <p> element.

The page section only of a Web browser window. The text "My first Web page", followed by some space, and below it the text "I hope you like it.". The areas above, between and below the text blocks are highlighted.

Without the <p> elements there would be no spacing and the text would just be in one long line.
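
To make that concrete, here is a sketch of the same body with the <p> tags taken out. Browsers treat a line break in the code as an ordinary space, so a visual browser would typically show both sentences on a single line. It is shown for comparison only; it is not a pattern to copy.

<body>
	My first Web page.
	I hope you like it.
</body>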

This kind of element is called a box or block element because there is an (often invisible) box around the element that separates it from the rest of the page. This is essential to make your document readable instead of just being one big kludge of text.

The second type of element is called an inline element. This is an element that does not have its own box, so it does not affect the flow of text in any way. The elements we are looking at in this section are inline elements unless otherwise stated.

Now let's add some further meaning to our text.

Adding Emphasis with <em>

First let's look at <em>. <em> is used to indicate text that should be given greater emphasis. It is more important than the text around it. In the paragraph below the phrase "The Lord Of The Rings" is considered more important so it is given more emphasis using <em>.

<p><em>The Lord Of The Rings</em> was written by JRR Tolkien.</p>

The way in which <em> is handled by a Web browser will vary. A visual browser such as Internet Explorer will usually display the text in italics whereas an audio browser such as an in-car Web browser or a browser used by blind people may speak the word in a louder voice. Later on we will look at ways that you can specify how your elements should be displayed but for now we will let the browser decide for us.

Adding Strong Emphasis with <strong>

The <strong> element is similar to <em> except that it indicates a stronger emphasis. Let's alter the example above to give the text "JRR Tolkien" a strong emphasis.

<p><em>The Lord Of The Rings</em> was written by <strong>JRR Tolkien</strong>.</p>

As with <em>, the way in which the <strong> element is handled depends on the browser being used. Visual browsers will usually display the text in bold; a speech browser may use a louder voice than it does for <em>.

Defining citations with <cite>

<cite> is used to indicate a citation or a reference to another source such as for further information. For example:

<p><cite>Homer Simpson</cite> said, "Operator, give me the number for nine-one-one!"</p>

In a visual browser the <cite> element will often be displayed in italics; an audio browser may inform the listener that this is a citation.

Abbreviations and Acronyms with <abbr> and <acronym>

In many fields today abbreviations and acronyms are common. But not everyone knows what they mean. Using the <abbr> and <acronym> elements enables you to provide their full meaning without cluttering your page.

Both the <abbr> and <acronym> elements work in the same way, and are interchangeable. There is no clear definition of the difference between an abbreviation and an acronym so use whichever you feel most suitable. I will talk about the <abbr> element but read this as meaning one or the other.

The <abbr> element uses an optional title attribute to show the full version of the abbreviation. For example:

<p>This document uses <abbr title="eXtensible Hyper Text Markup Language">XHTML</abbr>.</p>

A visual browser will often alert a user that an explanation of an abbreviation is available; a tool-tip then appears when the user moves their mouse over the term. A speech browser may speak the full version of the abbreviation on request.
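
The <acronym> element is written in exactly the same way. As a sketch, using the W3C mentioned earlier in this article (the sentence itself is only example text):

<p>XHTML is a standard from the <acronym title="World Wide Web Consortium">W3C</acronym>.</p>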

Please be aware that Internet Explorer on the PC, up to and including version 6, does not support the <abbr> element, although it does recognise <acronym>. If you are using this browser then you will not see any visual difference for <abbr> in the example above. However most other recent browsers, including Internet Explorer for the Macintosh, do support this element.

Quotes using <q> and <blockquote>

These elements are used to indicate text quoted from another source. <q> is an inline element (it does not break the text flow) and <blockquote> is a block element (it starts and finishes with a new line).

Let's start with <q>. <q> is used for short quotes that you want to include in a sentence or paragraph. <q> uses an optional cite attribute to indicate the location of a source for the quotation. For example:

<p>Homer Simpson said, <q cite="https://personal.inet.fi/taide/karjalainen/homer.html">Operator, give me the number for nine-one-one!</q>.</p>

The cite attribute shows that the quote originally came from https://personal.inet.fi/taide/karjalainen/homer.html. Visual browsers should add quotation marks for you around the quoted text. Speech browsers may indicate that this is a quotation.

The <blockquote> element works in the same way as the <q> element except it is a block element so it starts and finishes with a new line. It is used for longer quotes:

<p>Homer Simpson said:</p>
<blockquote cite="https://personal.inet.fi/taide/karjalainen/homer.html">
The code of the schoolyard, Marge! The rules that teach a boy to be a man. Let's see. Don't tattle. Always make fun of those different from you. Never say anything, unless you're sure everyone feels exactly the same way you do. What else..</blockquote>

Visual browsers display a <blockquote> with extra space on the right and left of the block (it is indented). Speech browsers may indicate that it is a quote. The cite attribute shows where the quote originally came from.

Computer Text with <code>, <samp>, <kbd> and <var>

These elements are used to indicate text that relates to a computer in a certain way, as follows:

<code> indicates computer program code
<samp> indicates sample output from a computer program
<kbd> indicates text that a user of a program should enter
<var> indicates a computer program variable or argument

If the above explanations mean nothing to you, don't worry. If you don't know what they mean you're not likely to be using them in your documents; just remember that they exist.
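
For readers who do expect to use them, here is a rough sketch of all four in one fragment. The program name greet, its output and the line of program code are invented purely for the example.

<p>Type <kbd>greet World</kbd> and press Enter. The program prints <samp>Hello World</samp>.</p>
<p>Inside the program the name is held in the variable <var>name</var> and printed with <code>print("Hello " + name)</code>.</p>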

Marking Document Changes with <ins> and <del>

Once you have released a document onto your Web site you may find that some information changes and you need to add or remove sections of text from your documents. While there is nothing to stop you from simply adding or removing text from your document, the <ins> and <del> elements can be used to mark added text and deleted text respectively.

For example, the following text has a section of each type of text:

<p>The code of the schoolyard, Marge! The rules that teach a boy to be a man. Let's see. <del>Don't tattle.</del> Always make fun of those different from you. <ins>Never say anything, unless you're sure everyone feels exactly the same way you do.</ins> What else..</p>

Visual browsers will often underline <ins> elements and put a line through <del> elements. Speech browsers may indicate that the text has been added or removed respectively.

Using Elements for their Intended Purpose

As you viewed the examples in this section you may have thought of using the elements purely for their visual effect on the text. For example the <del> element above will often be displayed with a line through the marked text. You should not use any element purely for its visual effect. Later on we will be looking at style sheets, which will give you full control over the way in which your text is displayed. Elements should only be used to mark text that has that meaning. This is called the semantics of your documents.

Summary

That's it for elements that are specific to certain types of text. Have a go at using them to create a document and get used to creating XHTML documents.

That's also the end of the first part of this article. I hope you enjoyed it. Part Two is now available.
