Agent Name Delivery
 The act of presenting one set of content to search engine spiders based on the name of that spider and another set of content to human web users.
This is done to present content that has been specifically optimized to rank well at each search engine while still presenting the same content to each
human visitor to the web site. This technology is easily detected as web surfers are able to use an agent name faking program to appear as if they are the
named spider and view the cloaked content.
 The process of sending search engine spiders to a tailored page, yet directing your visitors to what you want them to see. This is done using server
side includes (or other dynamic content techniques). SSI, for example, can be used to deliver different content to the client depending on the value of
HTTP_USER_AGENT. Most normal browser software packages have a user agent string which starts with "Mozilla" (coined from Mosaic and Godzilla).
Most search engine spiders have specific agent names, such as "Gulliver", "Infoseek sidewinder", "Lycos spider" and "Scooter". By switching on the value
of HTTP_USER_AGENT (a process known as agent detection), different pages can be presented at the same URL, so that normal visitors will never see
the page submitted to search engines (and vice versa).
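 As a rough illustration of agent detection, the sketch below switches on the agent string to choose between two pages at the same URL. The spider name fragments come from the examples above; the function names and the file names `spider.html` and `visitor.html` are made up for this example.

```python
# Name fragments of the spiders mentioned in the text (an illustrative list only).
SPIDER_NAMES = ("gulliver", "sidewinder", "lycos", "scooter")


def is_spider(user_agent: str) -> bool:
    """Return True if the agent string contains a known spider name."""
    agent = user_agent.lower()
    return any(name in agent for name in SPIDER_NAMES)


def choose_page(user_agent: str) -> str:
    """Pick the page to serve for this agent (hypothetical file names)."""
    return "spider.html" if is_spider(user_agent) else "visitor.html"


print(choose_page("Scooter/2.0"))                         # spider.html
print(choose_page("Mozilla/4.0 (compatible; MSIE 5.0)"))  # visitor.html
```

 Because the decision rests only on a string the client supplies, anyone who sends a spider's agent name receives the spider's page, which is why this technique is easy to detect.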
Browser
 The name of the web browser used to make the request. This is derived from the agent string, and suffers some of the same "lying" issues that it does
(see User-Agent). Summary decodes all of the standard methods of partially hiding the identity of the browser.
 A browser is a user agent, a software program used to access web content. There are many types of browsers. The most common and familiar are
the graphical browsers, Internet Explorer and Netscape. They "translate" HTML-encoded files into the text, images, sounds, and other features that
may be present on a website. Other browsers may display "text-only" content or audibly "read" a page.
Cache
 A temporary storage area that a web browser or service provider uses to store common pages and graphics that have been recently opened.
The cache enables the browser to quickly reload pages and images that were recently viewed. The proper pronunciation of cache is "cash".
Click Through
 The act of a visitor clicking on a link displayed within a set of search engine results in order to reach the web page represented by that link.
Click thru amounts related to each keyword search can be tracked as a method of determining if a particular keyword will entice visitors to a web site.
 The process of clicking on a link in a search engine output page to visit an indexed site. This is an important link in the process of receiving visitors
to a site via search engines. Good ranking may be useless if visitors do not click on the link which leads to the indexed site. The secret here is to
provide a good descriptive title and an accurate and interesting description.
Crawler
 An automated robot program that follows links to visit web sites on behalf of search engines or directories. Crawlers then process and index the
code and content of a web page to be stored in the search engine's database. For example, Googlebot is the crawler that travels the web finding and
indexing pages for the Google search engine.
Domain
 A sub-set of internet addresses. Domains are hierarchical, and lower-level domains often refer to particular web sites within a top-level domain.
The most significant part of the address comes at the end - typical top-level domains (TLD) are .com, .edu, .gov, .org (which sub-divide addresses
into areas of use). There are also various geographic TLDs (e.g. .de, .ca, .fr, .bg etc.) referring to particular countries. The relevance to search engine
terminology is that web sites which have their own domain name will often achieve better positioning than web sites which exist as a sub-directory of
another organization's domain.
HTML (Hyper Text Markup Language)
 An acronym for Hyper Text Markup Language. HTML is the authoring language that is used to create documents on the World Wide Web.
HTTP (Hyper Text Transfer Protocol)
 An acronym for Hyper Text Transfer Protocol. HTTP is a formal communication method that transmits requests and data between user agents or
web browsers and Web servers. When you enter a URL in your browser, this actually sends an HTTP command to the Web server directing it to
fetch and transmit the requested Web page.
 HTTP is the protocol used for information exchange on the WWW. HTTP defines how messages are formatted and transmitted, and what actions
an HTTP Server and an HTTP Client (which in most cases is a Browser) should take in response to various messages. HTTP uses a reliable,
connection-oriented transport service such as TCP. HTTP is a stateless Protocol, where each request is interpreted independently, without any
knowledge of the requests that came before it.
Hit
 A single request is often called a "hit" on the web site. Saying there were "56 hits" on an item means that there were 56 separate requests for that
item. The item may be a specific file, a particular referrer, or some other use of a resource by a single request. Summary uses the term "hits" to denote
the number of times some event occurred.
 In the context of visitors to web pages, a hit (or site hit) is a single access request made to the server for either a text file or a graphic. If, for
example, a web page contains ten buttons constructed from separate images, a single visit from someone using a web browser with graphics switched
on (a "page view") will involve eleven hits on the server. (Often the accesses will not get as far as your server because the page will have been cached
by a local internet service provider). In the context of a search engine query, a hit is a measure of the number of web pages matching a query
returned by a search engine or directory.
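 The page-view arithmetic above can be written out as a one-line sketch. The function name is an assumption made for this example, and it assumes nothing is served from a cache.

```python
def hits_for_page_view(embedded_files: int) -> int:
    """One hit for the page itself plus one hit per embedded file (e.g. images)."""
    return 1 + embedded_files


# A page with ten image buttons, viewed with graphics switched on.
print(hits_for_page_view(10))  # 11
```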
Host
 A computer is often referred to as a host when talking about networks. Each computer is assigned a unique IP address. There are some exceptions,
where several computers will share a single IP address, or one computer can have several IP addresses. In general, each unique IP address is referred
to as a host.
IP address
 Unique numerical identifier given to each Internet connection. The IP address is how data finds its way from a web site back to your computer.
IP addresses that are attached to dialup ISP accounts are usually dynamic and change with each connection. IP addresses that are attached to a
permanent Internet connection like a T1 line or a cable modem are static and stay the same all the time.
IP delivery
 The act of presenting one set of content to search engine spiders and another set of content to human web users. This is accomplished by
presenting different sets of content based on the IP address of a visitor. IP Delivery is a form of cloaking that is used to present content that has been
specifically optimized to rank well at each search engine while still presenting the same content to each human visitor to the web site. This technology
is difficult to detect, as it requires that a user present the IP address of a search engine spider in order to view the hidden web site content.
 Similar to agent name delivery, this technique presents different content depending on the IP address of the client. It is very difficult to view
pages hidden using this technique, because the real page is only visible if your IP address is the same as (for example) a search engine's spider.
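 A minimal sketch of the address check behind IP delivery, using Python's standard `ipaddress` module. The network listed here is a reserved documentation range standing in for a spider's addresses; a real list is an assumption outside the scope of this glossary.

```python
import ipaddress

# Placeholder network (reserved for documentation), not a real spider's range.
SPIDER_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_spider_ip(client_ip: str) -> bool:
    """Return True if the client address falls inside a listed spider network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in SPIDER_NETWORKS)


print(is_spider_ip("192.0.2.15"))    # True
print(is_spider_ip("198.51.100.4"))  # False
```

 Unlike the agent name, the source address is not something a visitor can simply retype, which is why pages hidden this way are hard to view.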
Method
 Each request contains a method. The most common method is "GET", which means simply get the requested item. A "HEAD" request means to get
information about the item, such as size and last date modified. A browser will often keep copies of items in its cache and then use a "HEAD" method
to check if the item has been modified since it was put in the cache.
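 One way to picture the cache check is as a comparison of two modification dates: the one stored with the cached copy and the one returned by the "HEAD" request. The sketch below assumes both are available as HTTP date strings; the function name is hypothetical and real browsers use more elaborate rules.

```python
from email.utils import parsedate_to_datetime


def cached_copy_is_stale(cached_last_modified: str, head_last_modified: str) -> bool:
    """Return True if the server's copy is newer than the cached copy."""
    cached = parsedate_to_datetime(cached_last_modified)
    current = parsedate_to_datetime(head_last_modified)
    return current > cached


print(cached_copy_is_stale("Mon, 01 Jan 2001 00:00:00 GMT",
                           "Tue, 02 Jan 2001 00:00:00 GMT"))  # True
```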
Optimization
 The changes that are made to the content and code of a web site in order to increase its rankings in the results pages of search engines and
directories. These changes may involve rewriting body copy, altering Title or Meta tags, removal of Frames or Flash content, and the seeking of
incoming links.
 Changes made to a web page to improve the positioning of that page with one or more search engines. A means of helping potential customers
or visitors to find a web site. Optimization may involve design/layout changes, new text for the title-tags, meta-tags, alt- attributes, headings, and
changes to the first 200-250 words of the main text. A large image map at the top of a page should be moved further down the page. Frames should
be avoided (unless navigational links are also provided within the frames).
PPC engine (Pay Per Click Engine)
 A search engine that allows webmasters to purchase their positions within the search results based on the amount of money they are willing to pay
for each click thru their site's listing receives.
Page Popularity
 A measure of the number and quality of links to a particular page (inbound links). Many search engines (and most noticeably Infoseek) are
increasingly using this number as part of the positioning process. The number and quality of inbound links is becoming as important as the
optimization of page content.
Protocol
 Agreed-upon methods of communications used by computers. A specification that describes the rules and procedures that products should follow to
perform activities on a network, such as transmitting data. If they use the same protocols, products from different vendors should be able to
communicate on the same network.
 The protocol used by the visitor's browser when requesting content from your server. Most commonly HTTP/1.1, or HTTP/1.0 (indicating different
revisions of the HTTP specification).
Proxy (HTTP Proxy)
 In the context of the WWW, an HTTP Proxy is an intermediary program which acts as both an HTTP Server and an HTTP Client, receiving a request
from a Client (in most cases a Browser) and then acting as an HTTP Client and making requests on behalf of other HTTP Clients. However, requests
to an HTTP Proxy can also be serviced internally, for example if the HTTP Proxy uses its Cache instead of sending a request to the origin HTTP Server.
In order to use an HTTP Proxy, the HTTP Client's request has to be explicitly addressed to the HTTP Proxy, which then sends a request to the origin
HTTP Server.
 A gateway that relays one Internet session to another.
 An intermediary program that acts as both a server and a client for the purpose of making requests on behalf of other clients. Proxies are often
used as client-side portals (i.e., a trusted agent that can access the Internet on the client's behalf) through the network firewall and as helper applications
for handling requests via protocols not implemented by the user agent.
 A software agent, often a firewall mechanism, which performs a function or operation on behalf of another application or system while hiding the
details involved.
 An intermediate server that sits between the client and the origin server. It accepts requests from clients, transmits those requests on to the origin
server, and then returns the response from the origin server to the client. If several clients request the same content, the proxy can deliver that content
from its cache, rather than requesting it from the origin server each time, thereby reducing response time.
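 The caching behaviour described in these definitions can be sketched in a few lines. The class name and the `fetch_from_origin` callback are assumptions made for illustration; a real proxy also has to deal with expiry, headers and errors.

```python
class CachingProxy:
    """Toy proxy that forwards a request to the origin only on a cache miss."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # callable: url -> content
        self._cache = {}

    def get(self, url: str) -> str:
        if url not in self._cache:
            # Cache miss: act as a client and ask the origin server.
            self._cache[url] = self._fetch(url)
        # Cache hit (or freshly stored copy): answer without contacting the origin.
        return self._cache[url]


origin_requests = []


def fake_origin(url: str) -> str:
    """Stand-in for the origin server; records each request it receives."""
    origin_requests.append(url)
    return "content of " + url


proxy = CachingProxy(fake_origin)
proxy.get("http://www.example.com/")
proxy.get("http://www.example.com/")
print(len(origin_requests))  # 1
```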
Ranking
 The placement of a web site within a particular search engine's results pages.
 The process of ordering web sites or web pages by a search engine or a directory so that the most relevant sites appear first in the search results for
a particular query. There is a lot of software that can be used to determine how a URL is positioned for a particular search engine when using a particular
search phrase.
Referrer (HTTP REFERER)
 The web page and location showing how your visitor clicked through to your site.
 The referring URL of the current request (the page containing the link the user clicked).
 The web browser generally provides the most recent previous URL when making a request; this is called the referrer. There are two major kinds of
referrers. Each graphic on a page will show that page as its referrer. When a visitor clicks on a link that points to a page at your site, the URL of the
external page containing the link is sent as the referrer.
 The URL of the web page from which a visitor came. The server's referrer log file will indicate this. If a visitor came directly from a search engine
listing, the query used to find the page will usually be encoded in the referrer URL, making it easy to see which keywords are bringing visitors.
The referrer information can also be accessed as document.referrer within JavaScript or via the HTTP_REFERER environment variable (accessible from
scripting languages).
 Referer (HTTP REFERER) is a misspelling of "referrer" which somehow made it into the HTTP standard. A given web page's referer is the URL of
whatever web page contains the link that the user followed to the current page. Most browsers pass this information as part of a request.
 The URL a visitor came from to reach your site, that is, the page that contains a link to your site. If the visitor followed a link to reach one of your pages, the
referrer will be the previous page. In the case of a graphic on a page, the referrer will be the page containing the graphic.
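 As noted above, a search engine's query is usually encoded in the referrer URL. The sketch below pulls the search words out of such a URL; the host name and the `q` parameter are assumptions, since each search engine chooses its own parameter name.

```python
from typing import List
from urllib.parse import parse_qs, urlparse


def search_terms(referrer: str, param: str = "q") -> List[str]:
    """Return the words of the search query found in a referrer URL, if any."""
    query = parse_qs(urlparse(referrer).query)
    values = query.get(param, [])
    return values[0].split() if values else []


print(search_terms("http://search.example.com/results?q=web+log+analysis"))
# ['web', 'log', 'analysis']
```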
Relevancy Algorithm
 The method a search engine or directory uses to match the keywords in a query with the content of each web page, so that the web pages found
can be ordered suitably in the query results. Each search engine or directory is likely to use a different algorithm, and to change or improve its algorithm
from time to time.
Request (HTTP Request)
 When you type a URL into a web browser, it sends a request for the item named by that URL to the server. Request can mean the entire request or
specifically the name of the item contained in the request.
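 To make the two senses of "request" concrete, the sketch below splits the first line of an HTTP request into its method, the requested item, and the protocol version. The type and function names are hypothetical.

```python
from typing import NamedTuple


class RequestLine(NamedTuple):
    method: str    # e.g. "GET" or "HEAD"
    item: str      # the name of the requested item
    protocol: str  # e.g. "HTTP/1.0" or "HTTP/1.1"


def parse_request_line(line: str) -> RequestLine:
    """Split a request line such as 'GET /index.html HTTP/1.1' into its parts."""
    method, item, protocol = line.strip().split(" ", 2)
    return RequestLine(method, item, protocol)


request = parse_request_line("GET /index.html HTTP/1.1")
print(request.method, request.item, request.protocol)
```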
SEM (Search Engine Marketing)
 The changes that are made to the content and code of a web site in order to increase its rankings in the results pages of search engines and directories.
These changes may involve rewriting body copy, altering Title or Meta tags, removal of Frames or Flash content, and the seeking of incoming links.
Search Engine Marketing also entails non-optimization methods of drawing traffic through search engines, including management of paid advertising
listings on search engines.
SEO (Search Engine Optimization)
 The changes that are made to the content and code of a web site in order to increase its rankings in the results pages of search engines and directories.
These changes may involve rewriting body copy, altering Title or Meta tags, removal of Frames or Flash content, and the seeking of incoming links.
SEP (Search Engine Placement)
 The changes that are made to the content and code of a web site in order to increase its rankings in the results pages of search engines and directories.
These changes may involve rewriting body copy, altering Title or Meta tags, removal of Frames or Flash content, and the seeking of incoming links.
SSL (Secure Sockets Layer)
 A session layer protocol that provides authentication and confidentiality to applications.
Search Engine
 A searchable index of web sites that is traditionally compiled by a spider that visits web pages and stores the information from each page in a database.
 A search engine most commonly refers to an application designed to find requested information on the Internet. Search engines can be used to find just
about anything indexed on the Internet. Search engines can locate names, images, music, documents, news and much more. Search engines are essential
research tools.
 A server or a collection of servers dedicated to indexing internet web pages, storing the results and returning lists of pages which match particular
queries. The indexes are normally generated using spiders. Some of the major search engines are Altavista, Excite, Hotbot, Infoseek, Lycos, Northern
Light and Webcrawler. Note that Yahoo is a directory, not a search engine. The term Search Engine is also often used to describe both directories and
search engines.
Search Indexer
 A web robot used by a search engine to index your web page content so that they can include your page in their search engine database. Note that
many different search engines might use the same search indexer and that some search engines include results from several different search indexers.
Spider (Spyder)
 An automated program that follows links to visit web sites on behalf of search engines or directories. Robots then process and index the code and
content of a web page to be stored in the search engine's database.
 That part of a search engine which surfs the web, storing the URLs and indexing the keywords and text of each page it finds.
TCP/IP
 Transmission Control Protocol/Internet Protocol. The suite of protocols the Internet is based on.
TLD (Top Level Domain)
 TLDs are the names at the top of the DNS naming hierarchy. They appear in domain names as the string of letters following the last (rightmost) ".",
such as "com" in "google.com". The administrator for a TLD controls what second-level names are recognized in that TLD. The administrators of the "root
domain" or "root zone" control what TLDs are recognized by the DNS. Commonly used TLDs include .com, .net, .edu, .org, .gov, and country codes such as .uk.
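 Since the TLD is the string after the last ".", it can be read off a domain name with a single split. This is only a sketch; the function name is made up for the example.

```python
def top_level_domain(domain_name: str) -> str:
    """Return the label after the last (rightmost) dot of a domain name."""
    # A trailing dot (the root) is dropped before splitting.
    return domain_name.rstrip(".").rsplit(".", 1)[-1].lower()


print(top_level_domain("google.com"))      # com
print(top_level_domain("www.example.de"))  # de
```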
URI (Uniform Resource Identifier)
 The WWW is considered to include objects accessed using an extendable number of Protocols, existing, invented for the WWW itself, or to be
invented in the future. Access instructions for an individual object under a given Protocol are encoded into forms of address string. Other Protocols
allow the use of object names of various forms. In order to abstract the idea of a generic object, the WWW needs the concepts of the universal set of
objects, and of the universal set of names or addresses of objects. A URI is a member of this universal set of names in registered name spaces and
addresses referring to registered Protocols or name spaces. A URL is a form of URI which expresses an address mapping onto an access algorithm using
network Protocols. A URN is a form of URI which uses a name space (and associated Resolution Protocols) for persistent object names.
URL (Uniform Resource Locator)
 A URL is the address of a resource which is retrievable using Protocols already deployed on the Internet. A URL defines an access Protocol, called a
"scheme", and a "scheme-dependent part", which has to provide sufficient information to locate an object using the specified scheme. In case of HTTP
URLs, the scheme is "http", and the scheme-dependent part specifies the name of the HTTP Server as well as the path of the object on the HTTP Server.
 Uniform Resource Locator is the term applied to Internet addresses. The acronym formed by the letters "URL" may be pronounced phonetically
as "earl" or by individual letter. URLs typically have four parts: protocol type (http), host domain name (www.google.com), directory path (/), and file
name (about.html).
 Universal Resource Locator. An address which can specify any internet resource uniquely. The beginning of the address indicates the type of
resource - e.g. http: for web pages, ftp: for file transfers, telnet: for computer login sessions or mailto: for e-mail addresses.
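 The four parts of a URL named above can be separated with Python's standard `urllib.parse` module. The helper name `url_parts` is an assumption for this sketch, and the URL is built from the example values in the text.

```python
import posixpath
from urllib.parse import urlparse


def url_parts(url: str):
    """Return (protocol type, host domain name, directory path, file name)."""
    parsed = urlparse(url)
    directory, file_name = posixpath.split(parsed.path)
    return parsed.scheme, parsed.hostname, directory, file_name


print(url_parts("http://www.google.com/about.html"))
# ('http', 'www.google.com', '/', 'about.html')
```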
Unique Hosts
 The number of distinct IP addresses and host names making requests. This may be used as a rough estimate of the number of distinct people accessing
your site, even though it does not exactly correspond to people. There are two major reasons why this number does not directly count people, and some
other minor ones. Some accesses are made through proxy servers or NAT gateways, machines that have a single IP address but may be in use by multiple
people. AOL and some of the other large service providers always route requests through proxy servers. Dial-up connections usually have a different IP
address each time you dial-up, so a single person accessing your site over the course of several different dial-up sessions will have several different IP
addresses.
Unique Visitor
 A unique visitor is a host that has made at least 1 hit on 1 page of your web site during the current period shown by the report. If this host makes
several visits during this period, it is counted only once.
 A real visitor to a web site. Web servers record the IP addresses of each visitor, and this is used to determine the number of real people who have
visited a web site. If for example, someone visits twenty pages within a web site, the server will count only one unique visitor (because the page accesses
are all associated with the same IP address) but twenty page views.
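 The twenty-page example can be checked with a short sketch that counts distinct IP addresses and total page requests. The function name and the sample address are hypothetical.

```python
def summarize(requests):
    """requests: list of (ip_address, page) pairs from one reporting period."""
    unique_visitors = len({ip for ip, _page in requests})
    page_views = len(requests)
    return unique_visitors, page_views


# One address requesting twenty different pages.
requests = [("192.0.2.7", "/page%d.html" % n) for n in range(20)]
print(summarize(requests))  # (1, 20)
```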
User-Agent
 A piece of software, such as a browser or spider, that interprets the content on a web server and presents it to the user as a web page. Examples
include Internet Explorer, Opera, Netscape and various search engine spiders.
 A piece of software acting as an "agent" on behalf of the visitor making the request. There is a standard way for that software to tell the web server its
name, version number, and possibly other information.
 A user agent is a generic term for any program used for accessing a website. This includes graphical browsers (such as Internet Explorer, Netscape or
Opera), robots and spiders, and any other software program that acts as an "agent" for someone or something accessing Web content.
 A user agent is a software program that can send requests to a web server and receive responses to those requests. This is precisely what a browser
does. But there are also automatic programs known as robots that are user agents. Further, web caching servers can send requests to a web server and
receive responses to those requests. They generally do this on behalf of other user agents, but could in some cases do it on their own behalf (for example
during pre-emptive caching) in which case they would behave like user agents. Seen from the perspective of the web server, all user agents look the
same. It is not immediately obvious to the server if it is being visited by a human-driven browser or by an automated retrieval system. So we refer to all of
them as user agents.
 A User Agent is any device that interprets HTML (or other web-) documents. The most commonly used User Agents are presently Web browsers on
computer screens. However, apart from other types of visual user agents (such as PDA screens, projectors and more) there are also non-visual User
Agents, such as search robots, speech synthesizers and Braille readers. To indicate this wide variety of media that HTML caters for, it is thus more
appropriate to talk of a User Agent than of a browser.
 The user agent is the client application that requests a document from an HTTP server. When the client sends a request to an HTTP server, it typically
sends the name of the user agent with the request header so that the server can determine the capabilities of the client software.
Visits
 The number of visits made by all visitors. Think of a visit as a session: if a unique IP address accesses a page and then requests three others with less than an hour between
any of the requests, all of those pages are included in the same visit. You should therefore expect multiple pages per visit, and multiple visits per unique visitor
(assuming that some of the unique IPs are logged with more than an hour between requests).
 A sequence of requests, all made from the same IP address, having the same agent string, and with no gap between requests of more than 30 minutes.
The time limit is configurable; by default it is 30 minutes. A visit normally corresponds to a single person moving through your web site, although there
can be exceptions. A proxy machine used by several people could result in two different people accessing the site from the same IP address, with the
same agent string, within the time limit. It is also possible for a single person to make different requests to your site from multiple IP addresses at the same
time. Both of these exceptions are rare, generally accounting for a small portion of all visits. Very high traffic sites tend to experience these situations more
often.
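 The second definition above amounts to a simple grouping rule, sketched here under stated assumptions: each request is an (IP address, agent string, timestamp in seconds) tuple, and a gap of more than 30 minutes for the same IP and agent starts a new visit. The names are hypothetical and this is not the code of any particular log analyzer.

```python
from typing import Dict, Iterable, Tuple

TIMEOUT_SECONDS = 30 * 60  # default time limit from the definition above


def count_visits(requests: Iterable[Tuple[str, str, float]],
                 timeout: float = TIMEOUT_SECONDS) -> int:
    """Count visits among (ip, agent, timestamp) requests given in any order."""
    last_seen: Dict[Tuple[str, str], float] = {}
    visits = 0
    for ip, agent, timestamp in sorted(requests, key=lambda r: r[2]):
        key = (ip, agent)
        previous = last_seen.get(key)
        # First request from this IP/agent pair, or gap longer than the limit.
        if previous is None or timestamp - previous > timeout:
            visits += 1
        last_seen[key] = timestamp
    return visits


sample = [
    ("192.0.2.1", "Mozilla/4.0", 0),
    ("192.0.2.1", "Mozilla/4.0", 600),   # 10 minutes later: same visit
    ("192.0.2.1", "Mozilla/4.0", 2401),  # gap over 30 minutes: new visit
    ("192.0.2.2", "Mozilla/4.0", 100),   # different IP address: separate visit
]
print(count_visits(sample))  # 3
```

 Passing a different `timeout` mirrors the configurable time limit mentioned in the definition.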
WWW (World Wide Web)
 The World Wide Web, or simply the Web, refers to a system of Internet servers that support documents formatted using HTML. These documents or
webpages are served to any of the various web browsers using HTTP. Web pages may contain graphics, movies, sound files or other hypermedia.
Pages or content may be linked to other pages or content using hyperlinks. You are navigating the Web when you follow hyperlinks.
Web Robot
 An automated program that follows links to visit web sites on behalf of search engines or directories. Web robots then process and index the code
and content of a web page to be stored in the search engine's database.
 A program making a request that is not in direct response to a person making a request of that program is thought of as a Web Robot. Web robots
are used for several purposes, such as search engine indexing robots, link checkers, e-mail address extractors, and update watchers. Summary has an
internal database of common known Web Robots, determined from the agent string. Any host making a request for "robots.txt" is counted as a possible
web robot. The "robots.txt" file is frequently used by robots to know which portions of your site should be avoided by robots.
 Any browser program which follows hypertext links and accesses web pages but is not directly under human control. Examples are the search engine
spiders, the "harvesting" programs which extract e-mail addresses and other data from web pages and various intelligent web searching programs.
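 A well-behaved robot consults "robots.txt" before fetching a page. The sketch below uses Python's standard `urllib.robotparser` module on an in-memory file; the robot name "ExampleBot", the rules and the URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# A small robots.txt asking all robots to avoid one portion of the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

print(parser.can_fetch("ExampleBot", "http://www.example.com/index.html"))         # True
print(parser.can_fetch("ExampleBot", "http://www.example.com/private/data.html"))  # False
```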