LRU prefixes

Benjamin Ooghe-Tabanou edited this page Dec 21, 2012 · 1 revision

An LRU prefix is, technically, an LRU. The difference is that an LRU prefix does not necessarily come from a real URL: it may be just the leading part of other LRUs. It is used to define a Web entity.

Examples:

  1. .com websites
    • "www.skyrock.com" has the LRU "s:http|h:com|h:skyrock|h:www". There is a page associated with this LRU.
    • But the LRU "s:http|h:com" has no valid URL, and therefore no page, associated with it. Nevertheless, this LRU is a prefix of "www.skyrock.com" as well as of any page or website in ".com". It is useful for retrieving all of these at once.
  2. Domains and subdomains
    • The CNRS domain hosts a lot of websites: "www.cnrs.fr" (the main site) as well as "inl.cnrs.fr", "www.ipmc.cnrs.fr" and many others (and even sub-sub-sites like "heliquest.ipmc.cnrs.fr/").
    • But strangely, the "cnrs.fr" URL (without "www") does not lead to any page. The LRU "s:http|h:fr|h:cnrs" could have a page, since it corresponds to a valid URL, but it does not. It is nevertheless useful if you have to aggregate the whole CNRS galaxy as a single entity.
  3. Forbidden pages
    • This very long URL leads to some page: "www.ipmc.cnrs.fr/cgi-bin/standard.cgi?descriptif=admin_accueil.txt&dossier1=presentation&dossier2=admin_accueil&lang=fr".
    • The shorter URL "www.ipmc.cnrs.fr/cgi-bin/" leads to a forbidden page. The "/cgi-bin/" part of the URL is there for technical reasons. The LRU "s:http|h:fr|h:cnrs|h:ipmc|h:www|p:cgi-bin" can still be used as an LRU prefix to define this part of the website.
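
The prefix relation used in the examples above can be sketched in a few lines of Python. This is only an illustration, not the project's actual implementation: the function names `url_to_lru` and `is_lru_prefix` are hypothetical, the "s:", "h:" and "p:" stem markers are taken from the examples on this page, a missing scheme is assumed to be "http", and query strings are simply ignored.

```python
from urllib.parse import urlsplit


def url_to_lru(url, default_scheme="http"):
    """Convert a URL into an LRU string of '|'-separated stems (hypothetical helper)."""
    # Assumption: URLs written without a scheme are treated as http.
    if "://" not in url:
        url = f"{default_scheme}://{url}"
    parts = urlsplit(url)
    stems = [f"s:{parts.scheme}"]
    # Host labels are reversed so the most generic one (the TLD) comes first.
    stems += [f"h:{label}" for label in reversed(parts.hostname.split("."))]
    # Each non-empty path segment becomes its own stem; the query is ignored here.
    stems += [f"p:{segment}" for segment in parts.path.split("/") if segment]
    return "|".join(stems)


def is_lru_prefix(prefix, lru):
    """Return True if every stem of `prefix` matches the leading stems of `lru`."""
    prefix_stems = prefix.split("|")
    return lru.split("|")[:len(prefix_stems)] == prefix_stems


skyrock = url_to_lru("www.skyrock.com")
print(skyrock)
print(is_lru_prefix("s:http|h:com", skyrock))
print(is_lru_prefix("s:http|h:fr|h:cnrs", url_to_lru("heliquest.ipmc.cnrs.fr/")))
```

Comparing whole stems rather than raw strings is a deliberate choice in this sketch: a plain string comparison would wrongly treat "s:http|h:fr|h:cn" as a prefix of "s:http|h:fr|h:cnrs".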