Uniform Resource Locator

From Canonica AI

Definition

A Uniform Resource Locator (URL), colloquially called a web address, is a reference to a web resource that specifies its location on a computer network and a mechanism for retrieving it. A URL is a specific type of Uniform Resource Identifier (URI), although the two terms are often used interchangeably. URLs are most commonly used to reference web pages (HTML documents), images, videos, style sheets, scripts, and other web resources. They are fundamental to web browsing because they give every resource a compact, human-readable address.

Components

A URL consists of up to five components: the scheme, authority, path, query, and fragment. Under the generic syntax of RFC 3986, the scheme and path are always present (although the path may be empty), while the authority, query, and fragment are optional.
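As a rough illustration of these five components, the sketch below splits a URL with Python's standard urllib.parse module. The URL itself is a hypothetical one, chosen only so that every component is present.

```python
from urllib.parse import urlsplit

# Hypothetical URL chosen to exercise all five components.
url = "http://www.example.com:8080/articles/123?q=example#section1"
parts = urlsplit(url)

print(parts.scheme)    # http
print(parts.netloc)    # www.example.com:8080  (the authority)
print(parts.path)      # /articles/123
print(parts.query)     # q=example
print(parts.fragment)  # section1
```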

Scheme

The scheme, often loosely called the protocol, is the part of the URL before the first colon and identifies the method used to access the resource. Common schemes include http (Hypertext Transfer Protocol), https (HTTP over TLS), ftp (File Transfer Protocol), and mailto (for email addresses). Scheme names are case-insensitive.

Authority

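The authority component of a URL, introduced by a double slash (//), contains the domain name or IP address of the server where the resource is located. It may also include a port number, separated from the host by a colon, when a port other than the scheme's default is to be used, and optional user information placed before an at sign (@).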
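The host and port can be read out of the authority programmatically. The following is a minimal sketch using Python's urllib.parse; the port 8443 and the URLs are assumptions made for the example.

```python
from urllib.parse import urlsplit

# Hypothetical URL with an explicit, non-default port.
with_port = urlsplit("https://www.example.com:8443/index.html")
print(with_port.hostname)  # www.example.com
print(with_port.port)      # 8443

# Without an explicit port, .port is None and the client
# falls back to the scheme's default (443 for https).
without_port = urlsplit("https://www.example.com/index.html")
print(without_port.port)   # None
```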

Path

The path component of a URL identifies a particular resource on the server. In the URL "http://www.example.com/articles/123", "/articles/123" is the path.

Query

The query component of a URL, preceded by a question mark (?), provides additional parameters that the server can use to customize the response. For example, in the URL "http://www.example.com/search?q=example", "q=example" is the query.
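Query strings are conventionally written as key=value pairs joined by ampersands (&). The sketch below parses and rebuilds such a query with Python's urllib.parse; the extra "page" parameter is a hypothetical addition to the example above.

```python
from urllib.parse import urlsplit, parse_qs, urlencode

# The "page" parameter is hypothetical, added to show multiple pairs.
query = urlsplit("http://www.example.com/search?q=example&page=2").query
params = parse_qs(query)
print(params)  # {'q': ['example'], 'page': ['2']}

# Building a query string from a mapping.
rebuilt = urlencode({"q": "example", "page": 2})
print(rebuilt)  # q=example&page=2
```

Each value is returned as a list because the same key may appear more than once in a query.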

Fragment

The fragment component of a URL, preceded by a hash sign (#), specifies a location within the resource. For example, in the URL "http://www.example.com/articles/123#section1", "section1" is the fragment. The fragment is interpreted by the client and, in HTTP, is not sent to the server.
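Because an HTTP client requests the resource without its fragment, it can be useful to separate the two. A minimal sketch using urldefrag from Python's urllib.parse:

```python
from urllib.parse import urldefrag

# Split the fragment from the part of the URL that identifies the resource.
resource_url, fragment = urldefrag("http://www.example.com/articles/123#section1")
print(resource_url)  # http://www.example.com/articles/123
print(fragment)      # section1
```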

Syntax

The generic syntax of a URL is defined by RFC 3986, published by the Internet Engineering Task Force (IETF). According to this specification, a URL has the form scheme:[//authority]path[?query][#fragment], where the components in square brackets are optional.
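RFC 3986 (Appendix B) gives a regular expression for breaking a URI reference into its components. The sketch below applies that expression in Python; the sample URL is hypothetical, and the expression only splits a string into parts, it does not validate them.

```python
import re

# Regular expression from RFC 3986, Appendix B.
# Group 2 = scheme, 4 = authority, 5 = path, 7 = query, 9 = fragment.
URI_PATTERN = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)

match = URI_PATTERN.match("http://www.example.com/search?q=example#results")
scheme, authority, path, query, fragment = match.group(2, 4, 5, 7, 9)

print(scheme)     # http
print(authority)  # www.example.com
print(path)       # /search
print(query)      # q=example
print(fragment)   # results
```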

Encoding

URLs may contain only a limited set of ASCII characters. Any other character, as well as a reserved character used outside its special role, must be percent-encoded: each byte is replaced by a "%" followed by two hexadecimal digits giving the byte's value. Non-ASCII characters are first converted to bytes, normally using UTF-8, and each resulting byte is then percent-encoded.
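As an illustration, the sketch below percent-encodes a hypothetical path containing a space and a non-ASCII character using Python's urllib.parse.quote, which encodes to UTF-8 by default. The second call selects ISO-8859-1 explicitly to show how the choice of character set changes the result.

```python
from urllib.parse import quote, unquote

# Hypothetical path with a non-ASCII character and a space.
encoded = quote("/search/café menu")
print(encoded)           # /search/caf%C3%A9%20menu
print(unquote(encoded))  # /search/café menu

# The same character encoded as a single ISO-8859-1 byte.
latin1 = quote("é", encoding="latin-1")
print(latin1)            # %E9
```

By default quote leaves "/" unencoded, which suits paths; a value destined for a query string would usually need it encoded as well.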

Relative and Absolute URLs

URLs can be classified as either absolute or relative. An absolute URL contains all the information necessary to locate a resource, starting with the scheme. A relative URL omits the scheme, and often the authority and part of the path, and is resolved against a base URL, such as the address of the document in which it appears.
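Resolution of a relative URL against a base can be sketched with urljoin from Python's urllib.parse. The base and the relative references below are hypothetical examples.

```python
from urllib.parse import urljoin

base = "http://www.example.com/articles/123"

# A bare name replaces the last path segment of the base.
sibling = urljoin(base, "456")
print(sibling)   # http://www.example.com/articles/456

# ".." climbs one level before descending again.
parent = urljoin(base, "../images/a.png")
print(parent)    # http://www.example.com/images/a.png

# A leading slash replaces the whole path but keeps scheme and authority.
rooted = urljoin(base, "/search?q=example")
print(rooted)    # http://www.example.com/search?q=example

# An absolute URL ignores the base entirely.
absolute = urljoin(base, "https://other.example/")
print(absolute)  # https://other.example/
```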

Uses

Web browsing is the most common use of URLs, but they also appear in many other contexts, such as XML namespaces, CSS @import rules, and command-line tools.

Security

URLs also raise several security considerations. For example, attackers can craft deceptive URLs for phishing attacks, embed malicious scripts in URL components that an application echoes back unsanitized, or manipulate URL parameters to exploit vulnerabilities in web applications.
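One common mitigation is to parse a URL and check its scheme and host against an allowlist before trusting it. The following is a minimal sketch of that idea, not a complete defense; the function name is_trusted and the allowlist contents are hypothetical.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}
ALLOWED_HOSTS = {"www.example.com"}  # hypothetical allowlist


def is_trusted(url: str) -> bool:
    """Return True only if the scheme and host are both on the allowlist."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # urlsplit rejects some malformed inputs, e.g. a bad IPv6 literal.
        return False
    return parts.scheme in ALLOWED_SCHEMES and parts.hostname in ALLOWED_HOSTS


print(is_trusted("http://www.example.com/articles/123"))        # True
# Deceptive URL: "www.example.com" is user information, the real host is evil.example.
print(is_trusted("http://www.example.com@evil.example/login"))  # False
# A script-bearing scheme is rejected outright.
print(is_trusted("javascript:alert(1)"))                        # False
```

Comparing the parsed host, rather than searching the raw string for a trusted name, is what catches the second case.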
