What is a URL (Uniform Resource Locator)?

A URL (Uniform Resource Locator) is a string of characters that assigns a unique address to each information resource on the Internet.

URL (Uniform Resource Locator) Definition

What is a URL?

URL stands for Uniform Resource Locator. It provides a unique address for each document on the World Wide Web, each Gopher item, and each USENET discussion group.

History

Uniform resource locators were a fundamental innovation in Internet history. They were first used by Tim Berners-Lee in 1991 to enable document authors to create hyperlinks on the World Wide Web.

Since 1994, Internet standards have included the concept of the URL in the more general URI (Uniform Resource Identifier), but the term URL is still widely used.

Although no standard has ever defined it that way, many people believe that URL originally stood for Universal Resource Locator.

This interpretation may stem from the fact that the U in URI was expanded as "Universal" before the release of RFC 2396, although the U in URL has always stood for "Uniform".

The URL of an information resource is the Internet address that allows the browser to find the resource and display it appropriately.

To do so, it combines the protocol to be used to retrieve the data, the name of the computer that provides the information, the directory where it is located, and the name of the file.

Description

The general format: scheme://machine/directory/file

Other data can also be added: scheme://user:password@machine:port/directory/file
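As a rough illustration of the extended format above, the following sketch uses `urlsplit` from Python's standard `urllib.parse` module to separate the pieces. The URL itself, including the user name, password, host, and port, is invented for the example.

```python
from urllib.parse import urlsplit

# Hypothetical URL in the scheme://user:password@machine:port/directory/file layout.
url = "ftp://alice:secret@files.example.com:2121/pub/docs/report.txt"

parts = urlsplit(url)

print(parts.scheme)    # ftp
print(parts.username)  # alice
print(parts.password)  # secret
print(parts.hostname)  # files.example.com
print(parts.port)      # 2121
print(parts.path)      # /pub/docs/report.txt
```

The optional user, password, and port are simply reported as `None` when a URL leaves them out.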

Scheme

A URL is usually classified according to its scheme, which indicates the network protocol used to retrieve the identified resource over the network. The URL begins with the scheme name, followed by a colon and a scheme-specific part.

Some Examples of URL Schemes

HTTP (HyperText Transfer Protocol) is the protocol used to transmit hypertext. All HTML pages on WWW servers are referenced through this scheme, which indicates a connection to a WWW server.

HTTPS (HyperText Transfer Protocol Secure) is the protocol used to connect to secure WWW servers. These servers are typically used for commerce and use encryption to prevent interception of the data sent, usually credit card numbers and personal data.

FTP (File Transfer Protocol) is used when the information to be accessed is on an FTP server. By default, the server is accessed anonymously. To specify a user name, use the form ftp://user@machine, and the client will then ask for the password.

Mailto is used to send emails, but not all browsers support it. In this case, only the target email address is specified, with no double slash: mailto:alias.mail@machinename

LDAP is used for lookups through the Lightweight Directory Access Protocol.

Telnet is used for remote terminal emulation, that is, to connect to a multi-user machine and access general accounts such as library accounts. The browser usually calls an external application that provides the connection. In this case, the machine and the login are specified.
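To make the role of the scheme more concrete, here is a small sketch that splits a few example URLs with Python's standard `urllib.parse.urlsplit`. All host names and addresses are placeholders, not real services. It shows the scheme name before the colon and how the scheme-specific part differs: mailto carries no `//machine` section, while the others do.

```python
from urllib.parse import urlsplit

# Placeholder URLs, one per scheme discussed above.
examples = [
    "http://www.example.com/index.html",
    "https://shop.example.com/checkout",
    "ftp://user@ftp.example.com/pub/file.txt",
    "mailto:alias.mail@example.com",
    "telnet://library.example.com:23",
]

# Collect (scheme, machine section, path) for each URL.
results = []
for url in examples:
    parts = urlsplit(url)
    results.append((parts.scheme, parts.netloc, parts.path))

for scheme, netloc, path in results:
    print(f"{scheme:7} netloc={netloc!r} path={path!r}")
```

For the mailto entry the machine section is empty and the whole address ends up in the path, which is why that scheme is written without a double slash.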

The general URL syntax, along with some popular schemes such as email, HTTP, FTP, and file, was first described in a Request for Comments in 1994, in RFC 1630, and then more specifically in RFC 1738 and RFC 1808.

Some of the schemes described in the first RFC are still valid. Others are discussed or refined by later standards.

Meanwhile, the definition of the generic URL syntax was moved into a separate line of URI specifications: RFC 2396 (1998) and RFC 2732 (1999). The current standard is STD 66, published as RFC 3986 (2005).

General URL Syntax

All URLs must follow a general syntax, regardless of the scheme. Each scheme can define its own syntax requirements for its scheme-specific part, but the full URL must still follow the general syntax.

Using a limited character set compatible with ASCII’s printable subset, the general syntax allows URLs to represent the address of a resource, regardless of the original format of the address components.

Schemes using typical connection-based protocols use a common syntax for generic URIs: scheme://authority/path?query#fragment

The authority usually includes the name or IP address of a server, sometimes followed by a colon (:) and a TCP port number. It can also include a username and password to authenticate yourself on the server.

The path is a specification of a location in some hierarchical structure that uses a slash (/) as the delimiter between components.

The query usually specifies the parameters of a dynamic query to some database or process running on the server.

The fragment identifies a portion of a resource, usually a location in a document.
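The generic syntax can be explored with Python's standard `urllib.parse` module. The sketch below is only an illustration: the URL and its parameter names are made up, and the trailing `#` component is exposed under the name `fragment`, which is the term RFC 3986 uses.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL containing every generic component.
url = "https://www.example.com:8443/docs/guide.html?lang=en&page=2#install"

parts = urlsplit(url)

print(parts.scheme)    # https
print(parts.netloc)    # www.example.com:8443  (the authority)
print(parts.path)      # /docs/guide.html
print(parts.query)     # lang=en&page=2
print(parts.fragment)  # install

# The path is hierarchical: "/" separates its components.
segments = parts.path.strip("/").split("/")
print(segments)        # ['docs', 'guide.html']

# The query is commonly a list of name=value parameters.
params = parse_qs(parts.query)
print(params)          # {'lang': ['en'], 'page': ['2']}
```

`parse_qs` returns a list for each parameter because the same name may appear more than once in a query.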

Uppercase/Lowercase Sensitivity

According to the current standard, the scheme and host components are case-insensitive and should be converted to lowercase during normalization. The other components should be assumed to be case-sensitive.

In practice, however, whether components other than the scheme and the host are case-sensitive depends on the web server and on the operating system of the machine hosting it.
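A minimal sketch of this kind of case normalization, using Python's standard `urllib.parse`, might look as follows. The helper name `normalize_case` and the sample URLs are assumptions made for the example, and the function deliberately covers only letter case, not the other normalization steps a full implementation would perform.

```python
from urllib.parse import urlsplit, urlunsplit

def normalize_case(url: str) -> str:
    """Lowercase the scheme and host; leave every other component untouched."""
    parts = urlsplit(url)
    # Any user:password prefix is case-sensitive, so only the host[:port]
    # portion after the last "@" is lowercased.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + at + hostport.lower()
    return urlunsplit(
        (parts.scheme.lower(), netloc, parts.path, parts.query, parts.fragment)
    )

print(normalize_case("HTTP://WWW.Example.COM/Docs/Index.HTML"))
# http://www.example.com/Docs/Index.HTML
```

The path keeps its original capitalization, since the server may treat `/Docs/Index.HTML` and `/docs/index.html` as different resources.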

URL in Daily Use

An HTTP URL combines, in a single simple address, the four essential pieces of information needed to retrieve a resource from anywhere on the Internet:

  1. The protocol used to communicate.
  2. The host (server) with which you communicate.
  3. The network port on the server to connect to.
  4. The path to the resource on the server (for example, the file name).

Most web browsers do not require the user to enter “http://” to go to the web page, as HTTP is the most common protocol used in web browsers.

Similarly, since 80 is the default port for HTTP, it is not usually specified.
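The four pieces of information, and the way an omitted port falls back to a default, can be sketched in Python as below. The `DEFAULT_PORTS` table and the `effective_port` helper are illustrative names rather than part of any library; 80 for HTTP comes from the text above, and 443 is the well-known default for HTTPS.

```python
from urllib.parse import urlsplit

# Illustrative table of default ports (assumption: only these two schemes matter here).
DEFAULT_PORTS = {"http": 80, "https": 443}

def effective_port(url: str):
    """Return the explicit port if the URL has one, else the scheme's default."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)

url = "http://www.example.com/index.html"  # placeholder address
parts = urlsplit(url)

print(parts.scheme)         # 1. protocol: http
print(parts.hostname)       # 2. host: www.example.com
print(effective_port(url))  # 3. port: 80 (not written in the URL)
print(parts.path)           # 4. path: /index.html

print(effective_port("http://www.example.com:8080/index.html"))  # 8080
```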

Because the HTTP protocol allows a server to respond to a request by pointing the web browser to a different URL, many servers also allow users to omit certain parts of the URL, such as www.

However, these omissions produce a technically different URL, so the web browser cannot make these adjustments on its own and has to rely on the server to respond with a redirect. A web server can also serve two different pages for URLs that differ only in a trailing / character.
