Uniform Resource Locator (URL):
A Uniform Resource Locator (URL), also known as a web address, is a reference to a web resource that specifies its location on a computer network and a mechanism for retrieving it. A URL is a specific type of Uniform Resource Identifier (URI), although many people use the two terms interchangeably. URLs are most commonly used to reference web pages (http), but are also used for file transfer (ftp), email (mailto), database access (JDBC), and many other applications.
Most web browsers display the URL of a web page above the page in an address bar. A typical URL could have the form http://www.example.com/index.html, which indicates a protocol (http), a hostname (www.example.com), and a file name (index.html).
Uniform Resource Locators were defined in RFC 1738 in 1994 by Tim Berners-Lee, the inventor of the World Wide Web, and the URI working group of the Internet Engineering Task Force (IETF), as an outcome of collaboration started at the IETF Living Documents birds of a feather session in 1992.
Every HTTP URL conforms to the syntax of a generic URI. The URI generic syntax consists of a hierarchical sequence of five components:
URI = scheme:[//authority]path[?query][#fragment]
where the authority component divides into three subcomponents:
authority = [userinfo@]host[:port]
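As a rough illustration of the two syntax rules above, the following sketch splits a URL into its five top-level components and then into the authority subcomponents, using Python's standard urllib.parse module. The URL itself (user name, password, port, query, and fragment) is a made-up example, not one taken from this article.

```python
from urllib.parse import urlsplit

# Hypothetical URL chosen so that every component of the generic syntax is present.
url = "https://alice:secret@www.example.com:8443/docs/index.html?lang=en#intro"

parts = urlsplit(url)

# The five top-level components: scheme, authority, path, query, fragment.
# urllib.parse calls the authority component "netloc".
components = {
    "scheme": parts.scheme,
    "authority": parts.netloc,
    "path": parts.path,
    "query": parts.query,
    "fragment": parts.fragment,
}

# The authority subcomponents: userinfo (user name and password), host, port.
authority = {
    "username": parts.username,
    "password": parts.password,
    "host": parts.hostname,
    "port": parts.port,
}

for name, value in {**components, **authority}.items():
    print(f"{name:10} {value}")
```

Note that the delimiters (the colon after the scheme, the two slashes, the question mark, and the hash) are not part of the component values that urlsplit returns.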
The URI comprises:
· A non-empty scheme component followed by a colon (:), consisting of a sequence of characters beginning with a letter and followed by any combination of letters, digits, plus (+), period (.), or hyphen (-). Although schemes are case-insensitive, the canonical form is lowercase and documents that specify schemes must do so with lowercase letters. Examples of popular schemes include http, https, ftp, mailto, file, data, and irc. URI schemes should be registered with the Internet Assigned Numbers Authority (IANA), although non-registered schemes are used in practice.
· An optional authority component preceded by two slashes (//), comprising:
  o An optional userinfo subcomponent that may consist of a user name and an optional password preceded by a colon (:), followed by an at symbol (@).
  o A host subcomponent, consisting of either a registered name or an IP address. IPv4 addresses must be in dot-decimal notation, and IPv6 addresses must be enclosed in brackets ([]).
  o An optional port subcomponent preceded by a colon (:).
· A path component, consisting of a sequence of path segments separated by a slash (/). A path is always defined for a URI, though the defined path may be empty (zero length). A segment may also be empty, resulting in two consecutive slashes (//) in the path component. A path component may resemble or map exactly to a file system path, but does not always imply a relation to one.
· An optional query component preceded by a question mark (?), containing a query string of non-hierarchical data.
· An optional fragment component preceded by a hash (#). The fragment contains a fragment identifier providing direction to a secondary resource, such as a section heading in an article identified by the remainder of the URI.
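Some of the rules in the list above can be checked with a few lines of code. The sketch below is only an illustration: is_valid_scheme and path_segments are hypothetical helper names, the regular expression is written from the scheme description above, and the addresses are example values.

```python
import re
from urllib.parse import urlsplit

# Scheme rule from the list above: a letter first, then any mix of
# letters, digits, plus (+), period (.), or hyphen (-).
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")


def is_valid_scheme(scheme):
    """Hypothetical helper: True if the whole string follows the scheme rule."""
    return SCHEME_RE.fullmatch(scheme) is not None


def path_segments(url):
    """Hypothetical helper: split the path component on "/", keeping empty segments."""
    return urlsplit(url).path.split("/")


# Schemes must start with a letter, so "1http" and the empty string are rejected.
scheme_checks = {s: is_valid_scheme(s) for s in ["http", "git+ssh", "1http", ""]}

# An IPv6 host is written in brackets; urlsplit returns the address without them.
ipv6 = urlsplit("http://[2001:db8::1]:8080/index.html")

# Two consecutive slashes produce an empty segment between "a" and "b".
# The first empty string comes from the slash that starts the path.
segments = path_segments("http://www.example.com/a//b")

# A path is always defined, but here it has zero length.
empty_path = urlsplit("http://www.example.com").path

print(scheme_checks)
print(ipv6.hostname, ipv6.port)
print(segments)
print(repr(empty_path))
```

The scheme check is case-insensitive by design, matching the rule that schemes are case-insensitive even though lowercase is the canonical form.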
A web browser will usually dereference a URL by performing an HTTP request to the specified host, by default on port number 80. URLs using the https scheme require that requests and responses be made over a secure connection to the website, by default on port number 443.
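The default-port behaviour can be sketched as a small lookup. This is an assumption-laden illustration rather than how any particular browser is implemented: effective_port and DEFAULT_PORTS are hypothetical names, and only the http and https schemes are covered.

```python
from urllib.parse import urlsplit

# Well-known default ports for the two schemes discussed above (hypothetical table).
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url):
    """Hypothetical helper: the port a client would connect to for this URL.

    An explicit port in the authority wins; otherwise the scheme's default
    is used. Returns None for schemes that are not in DEFAULT_PORTS.
    """
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)


print(effective_port("http://www.example.com/index.html"))   # no port given, http default
print(effective_port("https://www.example.com/index.html"))  # no port given, https default
print(effective_port("http://www.example.com:8080/"))        # explicit port in the URL
```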