Uniform Resource Locator
Definition
A Uniform Resource Locator (URL), colloquially termed a web address, is a reference to a web resource that specifies its location on a computer network and a mechanism for retrieving it. A URL is a specific type of Uniform Resource Identifier (URI), although the two terms are often used interchangeably. URLs most commonly identify web pages (HTML documents), images, videos, style sheets, JavaScript, and other web resources. They are fundamental to web browsing: a browser uses a URL to locate and request each resource it displays.
Components
A URL consists of several components, some mandatory and others optional. The main components are the scheme, authority, path, query, and fragment. The generic syntax requires only the scheme and the path (which may be empty); the authority, query, and fragment are optional.
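As a rough illustration of how the five components map onto a concrete URL, the sketch below uses `urlsplit` from Python's standard `urllib.parse` module. The URL is a hypothetical one chosen only because it exercises every component.

```python
from urllib.parse import urlsplit

# Hypothetical URL that contains all five components.
url = "https://www.example.com:8080/articles/123?q=example#section1"
parts = urlsplit(url)

print(parts.scheme)    # https
print(parts.netloc)    # www.example.com:8080  (the authority)
print(parts.path)      # /articles/123
print(parts.query)     # q=example
print(parts.fragment)  # section1
```

Note that `urlsplit` names the authority `netloc`, and that it returns the query and fragment without their leading "?" and "#".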
Scheme
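The scheme, often loosely called the protocol, identifies the method used to access the resource. It is the first component of a URL and is followed by a colon. Common schemes include http (Hypertext Transfer Protocol), https (HTTP Secure), ftp (File Transfer Protocol), and mailto (for email addresses). Scheme names are case-insensitive but are conventionally written in lowercase.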
Authority
The authority component of a URL, introduced by two slashes (//), contains the domain name or IP address of the server where the resource is located. It may also include a port number, separated from the host by a colon, if a port other than the scheme's default is to be used for the connection.
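The following sketch shows one way to pull the host and port out of the authority. The helper name `host_and_port` and the small table of default ports are assumptions made for this example, not part of any standard API.

```python
from urllib.parse import urlsplit

# Illustrative subset of well-known default ports (an assumption for this sketch).
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}

def host_and_port(url):
    """Return (hostname, port), falling back to the scheme's default port."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return parts.hostname, port

print(host_and_port("http://www.example.com:8080/articles/123"))
print(host_and_port("https://192.0.2.1/"))
```

When the URL carries no explicit port, `parts.port` is `None`, so the helper looks up the default for the scheme instead.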
Path
The path component of a URL identifies the resource on the server. In the URL "http://www.example.com/articles/123", "/articles/123" is the path.
Query
The query component of a URL, preceded by a question mark (?), provides additional parameters that the server can use to customize the response. For example, in the URL "http://www.example.com/search?q=example", "q=example" is the query.
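Queries are commonly written as "key=value" pairs joined by "&", although the generic syntax does not require this. Assuming that convention, a minimal sketch of decoding a query with Python's `parse_qs` might look like this (the `page` and `tag` parameters are hypothetical additions to the example above):

```python
from urllib.parse import urlsplit, parse_qs

url = "http://www.example.com/search?q=example&page=2&tag=a&tag=b"
query = urlsplit(url).query

# parse_qs maps each key to a list, because a key may appear more than once.
params = parse_qs(query)
print(params)
```

Each value is a list of strings; converting "2" to a number is left to the application.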
Fragment
The fragment component of a URL, preceded by a hash sign (#), specifies a location within the resource. For example, in the URL "http://www.example.com/articles/123#section1", "section1" is the fragment.
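The fragment is interpreted by the client rather than the server, so an HTTP client removes it before sending a request. A small sketch of that separation, using `urldefrag` from the Python standard library:

```python
from urllib.parse import urldefrag

url = "http://www.example.com/articles/123#section1"

# urldefrag splits a URL into the part without the fragment and the fragment itself.
without_fragment, fragment = urldefrag(url)

print(without_fragment)  # http://www.example.com/articles/123
print(fragment)          # section1
```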
Syntax
The generic syntax of a URL is defined by RFC 3986, published by the Internet Engineering Task Force (IETF). According to this specification, a URL begins with a scheme followed by a colon, then an optional authority introduced by two slashes, a path (which may be empty), an optional query introduced by a question mark, and an optional fragment introduced by a hash sign.
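RFC 3986 (Appendix B) gives a regular expression that breaks a reference into these five components. The sketch below wraps it in a hypothetical helper, `split_uri`. It only splits the string and does not validate it, so it should be read as an illustration of the grammar rather than a full parser.

```python
import re

# Regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")

def split_uri(uri):
    """Split a URI reference into its five components (None if absent)."""
    m = URI_RE.match(uri)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }

print(split_uri("http://www.example.com/search?q=example#top"))
print(split_uri("mailto:user@example.com"))
```

The second call shows that a URL with no authority, query, or fragment still matches: only the scheme and path are filled in, and the other entries are `None`.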
Encoding
URLs are transmitted using a restricted subset of the ASCII character set. Any character outside this subset, or a reserved character that is meant as literal data, must therefore be percent-encoded: it is replaced by a "%" followed by two hexadecimal digits for each byte of the character. Non-ASCII characters are normally converted to bytes using UTF-8 before encoding, so a single character may produce several "%XX" sequences.
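A brief sketch of this encoding using `quote` and `unquote` from Python's `urllib.parse`, which encode with UTF-8 by default. The sample strings are made up for illustration.

```python
from urllib.parse import quote, unquote

# "é" is two bytes in UTF-8 (0xC3 0xA9); the space becomes %20.
encoded = quote("café menu")
print(encoded)           # caf%C3%A9%20menu
print(unquote(encoded))  # café menu

# By default "/" is left alone; pass safe="" to encode reserved characters too.
print(quote("a/b?c", safe=""))  # a%2Fb%3Fc
```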
Relative and Absolute URLs
URLs can be classified as either absolute or relative. An absolute URL contains all the information necessary to locate a resource, including the scheme. A relative URL omits one or more leading components, such as the scheme and authority, and is resolved against a base URL, typically the URL of the document in which it appears.
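Resolution of a relative URL against a base can be sketched with `urljoin` from Python's standard library. The base URL is taken from the path example earlier in this article, and the relative references are hypothetical.

```python
from urllib.parse import urljoin

base = "http://www.example.com/articles/123"

# Each relative reference is resolved against the same base URL.
resolved = {
    ref: urljoin(base, ref)
    for ref in ["456", "../images/logo.png", "/search?q=example", "#section1"]
}

for ref, absolute in resolved.items():
    print(f"{ref!r:24} -> {absolute}")
```

A reference without a leading slash replaces the last path segment of the base, a leading slash replaces the whole path, and a bare fragment keeps the base path unchanged.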
Uses
URLs are most commonly used in web browsing, but they also appear in other contexts, such as XML namespaces, CSS @import rules, and various command-line tools.
Security
URLs also raise several security considerations. For example, attackers can craft deceptive URLs to conduct phishing attacks, embed malicious scripts in URL components, or manipulate URLs to exploit vulnerabilities in web applications.
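One deceptive pattern places a trusted-looking name before an "@" in the authority, so that the real host is whatever follows it. The sketch below checks the parsed host, not the raw string, against an allow-list. The function name `is_trusted`, the allow-list, and the `evil.test` host are all hypothetical, and a real application would need further checks.

```python
from urllib.parse import urlsplit

def is_trusted(url, allowed_hosts):
    """Accept only http(s) URLs whose parsed host is in the allow-list."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # Malformed URLs are rejected rather than guessed at.
        return False
    return parts.scheme in ("http", "https") and parts.hostname in allowed_hosts

allowed = {"www.example.com"}
print(is_trusted("https://www.example.com/login", allowed))            # True
print(is_trusted("https://www.example.com@evil.test/login", allowed))  # False
print(is_trusted("javascript:alert(1)", allowed))                      # False
```

Comparing the parsed `hostname` avoids the mistake of searching for a trusted name anywhere in the URL text, which the second example would defeat.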