The Components of a URL – What Makes up a Web Address
Digital marketers, particularly those in client-side roles, are often so busy doing the day job that learning how websites work and what makes them tick is a challenge. One of the ideas behind TakeItOffline is that we give you the ability to understand and question aspects that are often overlooked in training.
The URL:
http://www.TakeItOffline.co.uk/Directory/page.htm?parameter1=value&parameter2=value#fragment
Protocol:
Typically this is either http or https. The difference is that https is encrypted, so it protects the user's data sent between the browser and the website. Google and other companies are increasingly suggesting that everything should be on https, which is often referred to by the name of its encryption layer: SSL or its successor, TLS.
Subdomain:
An example of a subdomain is ‘www’, which can often be excluded. Without the ‘www’, the site sits at the root of the domain. Sometimes people use subdomains for different language variants or when targeting other countries, for example uk.takeitoffline.co.uk, fr.takeitoffline.co.uk, etc. Another common practice is for the blog to sit on a subdomain, such as blog.takeitoffline.co.uk.
Domain:
This is everything before that first forward slash, ignoring the protocol and any subdomains. Examples of this are bbc.co.uk and bbc.com. These are purchased (or rented) from domain registrars.
TLD:
TLD stands for top-level domain. This is the .co.uk or .com which comes after the domain name (strictly speaking the TLD in .co.uk is just .uk, but .co.uk behaves like one when you buy a domain). These typically refer to a country or a type of institution. Recently a whole array of new TLDs has been introduced, such as .agency, and these are available to purchase as part of your domain name.
A good example is https://salt.agency, where the TLD is ‘agency’. These are called generic TLDs (gTLDs); originally .com and various others were also intended to be gTLDs.
It is worth noting that some domain owners are selling subdomains of their own domain as pseudo domains. A good example is anything ending in .uk.com, which is technically a subdomain of .com. You don’t need to worry about this unless you are buying a domain.
When purchasing a domain name, try to buy a TLD for the country you are targeting, or a .com, which is more global. It is rare that I would recommend anything else (although tech startups are often opting for the geeky .io, such as kraken.io, this TLD really belongs to the British Indian Ocean Territory).
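To make the subdomain, domain and TLD split concrete, here is a minimal Python sketch. The function name split_hostname and the short KNOWN_SUFFIXES set are my own assumptions for illustration; the set only covers the endings used in this article, and a real tool would need a far more complete list.

```python
# Hand-picked endings for illustration only; this is not a complete list.
KNOWN_SUFFIXES = {"co.uk", "com", "agency", "co"}


def split_hostname(hostname, suffixes=KNOWN_SUFFIXES):
    """Return (subdomain, domain, suffix) for a hostname.

    Candidate suffixes are tried from longest to shortest,
    so "co.uk" is matched before anything shorter.
    """
    labels = hostname.lower().split(".")
    for i in range(1, len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in suffixes:
            domain = labels[i - 1]
            subdomain = ".".join(labels[: i - 1])
            return subdomain, domain, suffix
    raise ValueError(f"no known suffix in {hostname!r}")


print(split_hostname("www.TakeItOffline.co.uk"))
print(split_hostname("salt.agency"))
# uk.com is not in the suffix set, so "uk" is the domain here
# and "www.hello" ends up as the subdomain.
print(split_hostname("www.hello.uk.com"))
```

Because uk.com is not treated as a TLD, anything in front of it counts as a subdomain, which is exactly the pseudo-domain situation described above.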
Directories:
The next part is the directories, or subfolders. This is more challenging to explain: classically, websites had a URL structure that ended in a page, but today it is common practice (and with good reason) for the URL to end at a directory.
Consistent linking is important for all URLs, whether they are one, two or three levels deep (one/two/three/), because a URL with a trailing slash and the same URL without it can be treated as two different addresses, which makes a negative difference in some setups.
Originally websites were built in the same way as the files and folders on your computer, but now URLs can be rewritten so that many of them are served from a single file. Directories can be case sensitive, which means that you should use consistent case when linking to them (I would recommend always lower case).
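One way to keep internal links consistent is to pass them through a small helper before publishing. This is only a sketch: normalise_link is a hypothetical name, example.com is a placeholder, and the policy of lower-casing the path and adding a trailing slash to directory-style URLs is an assumption that only makes sense if your site actually serves its pages that way.

```python
from urllib.parse import urlsplit, urlunsplit


def normalise_link(url):
    """Lower-case the path and give directory-style URLs a trailing slash."""
    parts = urlsplit(url)
    path = parts.path.lower() or "/"
    last_segment = path.rsplit("/", 1)[-1]
    # A last segment containing a dot is treated as a page (page.htm);
    # anything else is treated as a directory and gets a trailing slash.
    if last_segment and "." not in last_segment:
        path += "/"
    return urlunsplit(parts._replace(path=path))


print(normalise_link("https://example.com/One/Two/Three"))
print(normalise_link("https://example.com/Directory/Page.htm?x=1"))
```

Parameters and fragments are left untouched, since those can be case sensitive in their own right.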
Read over the “An introduction to information architecture” article for more information.
Pages:
As mentioned above, pages are now less common. Typically URLs no longer end with an extension such as .html, .asp or .php, but if you spot an extension (a dot followed by something), then the part before it is the page. Like directories, this can be case sensitive, so be consistent; again, I would recommend lower case.
Parameters:
When webpages first became dynamic, parameters were often used to modify the data called from a database. They are still commonly used for this purpose (filtering, sorting, etc.) but less frequently for separate pages of content. They can cause issues for search engines, so we try to be very controlled in their usage and how we manage them.
Parameters follow a question mark, and each one has a name and a value, both of which can be case sensitive. If you have multiple parameters, these are separated by ampersands. A common use for parameters is tracking (such as Google's parameters for separating channels and campaigns); tracking parameters don’t usually modify the data but are picked up by other tools.
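As a rough illustration of how parameters break down into names and values, here is a small Python sketch using the standard library. The /shop/ URL and its sort and Colour parameters are made up for the example, and strip_tracking is a hypothetical helper; utm_source and utm_campaign are Google's campaign tracking parameters.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical URL: two parameters that change the content, two that only track.
url = ("https://www.takeitoffline.co.uk/shop/"
       "?sort=price&Colour=red&utm_source=newsletter&utm_campaign=spring")

# Everything after the "?" is split on "&" into (name, value) pairs.
pairs = parse_qsl(urlsplit(url).query)
for name, value in pairs:
    print(name, "=", value)


def strip_tracking(url, prefixes=("utm_",)):
    """Return the URL without parameters whose names start with a tracking prefix."""
    parts = urlsplit(url)
    kept = [(name, value)
            for name, value in parse_qsl(parts.query, keep_blank_values=True)
            if not name.startswith(prefixes)]
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(strip_tracking(url))
```

Note that the names are kept exactly as typed, so Colour and colour would be two different parameters.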
Path:
The path is the combination of the directories and the page: anything after that first slash, excluding parameters and fragments.
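The relationship between path, directories and page can be sketched in a few lines of Python. The helper name directories_and_page is my own, and it makes one simplifying assumption: whatever follows the last slash is treated as the page, even if it has no extension.

```python
from urllib.parse import urlsplit


def directories_and_page(url):
    """Split a URL's path into its directory part and its page part."""
    path = urlsplit(url).path  # parameters and fragment are already excluded
    directories, slash, page = path.rpartition("/")
    return directories + slash, page


print(directories_and_page(
    "http://www.TakeItOffline.co.uk/Directory/page.htm?parameter1=value#fragment"))
print(directories_and_page("https://www.takeitoffline.co.uk/one/two/three/"))
```

For a URL that ends at a directory, the page part simply comes back empty.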
Fragment:
Finally, the last part is the fragment. This can be set up to ‘jump’ to a specific location in a document, to fire some JavaScript or to add tracking. One aspect of fragments is that they aren’t passed to the server, which means that they don’t get in the way of a site which caches (remembers) popular pages. Instead of using conventional parameters, you can sometimes use fragments as parameters, which can reduce the work the server has to do.
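To see all of the parts together, here is a minimal Python sketch that splits the example URL from the top of this article using the standard library. The variable names are mine; the last step drops the fragment to show roughly what the server is actually asked for.

```python
from urllib.parse import urlsplit, urlunsplit

url = ("http://www.TakeItOffline.co.uk/Directory/page.htm"
       "?parameter1=value&parameter2=value#fragment")
parts = urlsplit(url)

print("protocol:  ", parts.scheme)
print("host:      ", parts.hostname)   # subdomain, domain and TLD, lower-cased
print("path:      ", parts.path)       # directories plus page
print("parameters:", parts.query)
print("fragment:  ", parts.fragment)

# The fragment stays in the browser, so the request goes out without it.
request_url = urlunsplit(parts._replace(fragment=""))
print("requested: ", request_url)
```

Two visitors using different fragments therefore request the same address, which is why a cached copy of the page can be served to both.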
A couple of examples:
Example 1
https://test.co/?goto=123
- Protocol is https
- No subdomain & no path
- TLD is .co, which is actually Colombia, but many people are using it if they can’t find the .com
- Parameter is called goto and the value is 123.
Often URLs that look like this redirect somewhere else and are just used for tracking.
Example 2
http://www.hello.uk.com/welcome.php
- This one is on http
- The subdomain is actually “www.hello”, which is trickier, I know. Because uk.com isn’t a top-level domain, the domain is actually uk.com and the TLD is .com
- The page is welcome.php
IP addresses:
Historically a domain name was just a way of pointing to a specific IP address. As the number of these started to dry up (too many users, too many devices), one IP would often have many websites on it. This is beginning to change with IPv6, which increases the number of IPs available.
IPv4 addresses are made up of four numbers separated by periods, like this: 192.168.1.1. It is unusual to see public content shared on IP addresses.
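If you want to check what kind of address you are looking at, Python's standard ipaddress module can tell you. This is just a sketch: the IPv6 address below is one I have picked from the range reserved for documentation, not one taken from a real site.

```python
import ipaddress

v4 = ipaddress.ip_address("192.168.1.1")
v6 = ipaddress.ip_address("2001:db8::1")  # assumed example from the documentation range

print(v4.version, v4.is_private)  # the example above is a private (internal) address
print(v6.version)

# How many addresses each version can describe: 2 to the power of its bit length.
print(2 ** v4.max_prefixlen)
print(2 ** v6.max_prefixlen)
```

The gap between those last two numbers is the reason IPv6 eases the shortage described above.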
Anything left?
Typically there isn’t anything else you should come across in the wild. Sometimes you will find that URLs don’t actually exist for normal users and are only set up for internal viewing.
But if you find something that doesn’t match the above, or have any questions, please drop a comment below.