Showing posts with label how_tos. Show all posts

Reviving Dead Web Pages

Learning how to retrieve web pages that no longer exist from a search engine's cache may turn out to be one of the most important skills you learn. Reviving a page that is still cached by a search engine can help you bridge potentially important gaps in information and research.

One way to do this, and by far the easiest, is to use the Wayback Machine at the Internet Archive (link).
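If you would rather check for an archived copy from a script than through the web form, here is a minimal sketch. It assumes the Wayback Machine's public availability endpoint (https://archive.org/wayback/available), which is not mentioned in this post. The function names and the sample response are made up for illustration, and nothing below touches the network.

```python
import json
from urllib.parse import urlencode

# Assumed lookup endpoint of the Wayback Machine (not named in the post).
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_availability_url(page_url, timestamp=None):
    """Return the lookup URL for the snapshot closest to an optional YYYYMMDD date."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response_text):
    """Return the archived copy's URL from a JSON response, or None if there is none."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


# Hypothetical response body, shaped like the endpoint's JSON, for illustration only.
sample = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20060101000000/http://example.com/",
            "timestamp": "20060101000000",
            "status": "200",
        }
    }
})

print(build_availability_url("example.com", "20060101"))
print(closest_snapshot(sample))
```

To use it for real, you would fetch the URL returned by `build_availability_url` and pass the response text to `closest_snapshot`. A result of `None` means no copy was reported for that address.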

However, the Wayback Machine *does not* have a copy of every web page ever uploaded to the web. Some site owners specifically request that their sites not be crawled for inclusion in the archive, while other webmasters include a few lines of code that block the crawlers (better known as robots). So once a "blocked" page disappears, it's gone forever.
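To see what those few lines of blocking code can look like, here is a small sketch using Python's standard `urllib.robotparser` module. The robots.txt text and the example.com address are hypothetical, and the crawler name `ia_archiver` is an assumption about which robot a site owner would have named, so check the archive's own documentation before relying on it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that shuts out one crawler and allows everyone else.
# "ia_archiver" is an assumed crawler name used only for illustration.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(ROBOTS_TXT, "ia_archiver", "http://example.com/page.html"))   # False
print(is_allowed(ROBOTS_TXT, "SomeOtherBot", "http://example.com/page.html"))  # True
```

A page that comes back `False` for the archive's robot is the kind of page you should not expect to find archived there.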

But you still might be able to find other pages that are just as important and here's how you do that:



From The Google help center (link):


Advanced Operators

Google supports several advanced operators, which are query words that have special meaning to Google. Typically these operators modify the search in some way, or even tell Google to do a totally different type of search. For instance, "link:" is a special operator, and the query [link:www.google.com] doesn't do a normal search but instead finds all web pages that have links to www.google.com.

Several of the more common operators use punctuation instead of words, or do not require a colon. Among these operators are OR, "" (the quote operator), - (the minus operator), and + (the plus operator). More information on these types of operators is available on the Basics of Search page. Many of these special operators are accessible from the Advanced Search page, but some are not. Below is a list of all the special operators Google supports.


and the cache search operator:



The query [cache:] will show the version of the web page that Google has in its cache. For instance, [cache:www.google.com] will show Google's cache of the Google homepage. Note there can be no space between the "cache:" and the web page url.

If you include other words in the query, Google will highlight those words within the cached document. For instance, [cache:www.google.com web] will show the cached content with the word "web" highlighted.

This functionality is also accessible by clicking on the "Cached" link on Google's main results page.
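If you look up a lot of pages, it can help to build the cache query with a small script so the "no space after cache:" rule is never broken by accident. This is only a sketch: the function names are made up, and the `/search?q=` form of the search address is an assumption that is not taken from the help text quoted here.

```python
from urllib.parse import urlencode


def build_cache_query(page_url, highlight_words=()):
    """Build a query such as 'cache:www.google.com web'."""
    # The operator and the address are joined directly: no space after "cache:".
    query = "cache:" + page_url.strip()
    if highlight_words:
        # Extra words are separated by spaces and get highlighted in the cached copy.
        query += " " + " ".join(highlight_words)
    return query


def build_search_url(query):
    """Wrap a query in a search URL (the /search?q= form is an assumption)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = build_cache_query("www.google.com", ["web"])
print(query)
print(build_search_url(query))
```

The first line printed is the same query as the example in the help text, and the second is that query encoded for a browser's address bar.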


So now you can look for deleted web pages on your own and fill in those research gaps.


Sites that contain more than one copy of a web page:

Archive-It
http://www.archive-it.org/public/all_collections

Wayback Machine
http://www.archive.org/web/web.php


WebCite (Mostly health related pages)

http://www.webcitation.org/query

Other single copy archives include:

Gigablast (Goes Back 1 year from the current calendar date)

http://www.gigablast.com/

Exalead (Goes Back 6 months from the current date)

http://www.exalead.com/


Family-source (Goes back to 2005)

http://www.family-source.com




And the sites below contain further technical explanation, tools, and guides on how to do cache searching:


http://www.searchengineshowdown.com/others/archive.shtml


http://www.robotstxt.org/faq.html

http://www.onlinemag.net/mar02/OnTheNet.htm

http://web.archive.org/collections/web/advanced.html

http://www.google.com/help/features.html


http://www.google.com/help/operators.html

http://www.googleguide.com/cached_pages.html

http://en.wikipedia.org/wiki/Page_cache

http://help.yahoo.com/l/us/yahoo/search/basics/basics-09.html


http://www.pagefactor.com/

http://urlcut.com/ManyFacesGoogle

http://blog.searchenginewatch.com/blog/060118-165021

http://www.webuildpages.com/cache/cachetoolpublic.pl

http://www.web-caching.com/

http://www.web-cache.com/

http://squidbook.org/index-two.html

http://www.caching.com/

http://www.ircache.net/

http://pages.cs.wisc.edu/~cao/links.html

http://www.w3.org/Propagation/

http://www.mnot.net/cache_docs/

http://vancouver-webpages.com/CacheNow/

http://forskningsnett.uninett.no/arkiv/desire/

http://www-sor.inria.fr/projects/relais/reading-list.html

http://excalibur.usc.edu/

http://www.research.rutgers.edu/~davison/web-caching/bibliography.html


http://www.networkworld.com/netresources/caching.html


http://www.w3.org/Daemon/



Related Post:

Making Pdf Files

Using Emule to get and share information

Making Pdf Files

With the constant changing and disappearing of web sites I have just come to realize how important it is to capture the information the instant you see it and then proceed to archive it or somehow record it.

So, this post will concern software used in creating pdf files.

There are several ways you can make pdf files from web pages.

The first and easiest way is to use Adobe Acrobat, if you have a copy installed on your computer.

Adobe Acrobat (the current version is number 8)

http://www.adobe.com/products/acrobatpro/acrobatstd.html


Another way to make PDFs is to download software that installs a PDF print driver on your computer. You then "print" the page to that driver, and it saves the output as a PDF file.

First you'll need the Ghostscript software:

What is Ghostscript?

Ghostscript is the name of a set of software that provides:

* An interpreter for the PostScript (TM) language, with the ability to convert PostScript language files to many raster formats, view them on displays, and print them on printers that don't have PostScript language capability built in;

* An interpreter for Portable Document Format (PDF) files, with the same abilities;

* The ability to convert PostScript language files to PDF (with some limitations) and vice versa; and

* A set of C procedures (the Ghostscript library) that implement the graphics capabilities that appear as primitive operations in the PostScript language.
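The PostScript-to-PDF conversion in the list above can also be run directly, without going through a print driver. Here is a rough sketch that assumes Ghostscript is installed and that its program is named `gs` (on Windows it is usually named differently, for example `gswin64c`). The file names `page.ps` and `page.pdf` and the function names are made up for illustration.

```python
import shutil
import subprocess


def build_ps2pdf_command(ps_path, pdf_path, gs_binary="gs"):
    """Return the argument list for converting one PostScript file to PDF."""
    return [
        gs_binary,
        "-dBATCH",            # exit when the input has been processed
        "-dNOPAUSE",          # do not wait for a keypress between pages
        "-sDEVICE=pdfwrite",  # select the PDF output device
        "-sOutputFile=" + pdf_path,
        ps_path,
    ]


def convert_ps_to_pdf(ps_path, pdf_path, gs_binary="gs"):
    """Run Ghostscript if it is installed; return False if the program is missing.

    Raises subprocess.CalledProcessError if Ghostscript reports a failure.
    """
    if shutil.which(gs_binary) is None:
        return False
    subprocess.run(build_ps2pdf_command(ps_path, pdf_path, gs_binary), check=True)
    return True


# Show the command that would be run, without running it.
print(" ".join(build_ps2pdf_command("page.ps", "page.pdf")))
```

Calling `convert_ps_to_pdf("page.ps", "page.pdf")` would actually run the conversion. Building the command as a list rather than one long string keeps file names with spaces in them from being split apart.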




You can download the Ghostscript software from these pages:



http://pages.cs.wisc.edu/~ghost/doc/AFPL/index.htm



Ghostscript and Ghostview:



http://pages.cs.wisc.edu/~ghost


GNU Ghostscript:


http://www.gnu.org/software/ghostscript/ghostscript.html



PostScript Resources:

http://www.geocities.com/SiliconValley/5682/postscript.html


Here's a short list of freeware PDF print drivers:


Bull Zip

http://www.bullzip.com/products/pdf/info.php

Pdf Creator

http://www.pdfforge.org/products/pdfcreator

Conv2Pdf

http://www.conv2pdf.com/


Cute Pdf

http://www.cutepdf.com/Products/CutePDF/writer.asp


Pdf 995

http://www.pdf995.com


Primo Pdf

http://www.primopdf.com

Very pdf Doc 2 Pdf

http://www.verypdf.com/pdfcamp/doc2pdf_readme.html


doPDF Free PDF Converter

http://www.dopdf.com

Pdf 4 Free

http://www.pdfpdf.com/pdf4free.html

Tomahawk PDF+ 2.9.5.0

http://www.nativewinds.montana.com/software/tpdfplus.html

PDF-o-matic

http://www.easysw.com/htmldoc/pdf-o-matic.php


Pdf 4 U

http://www.pdf4u.com

Express Pdf

http://www.expresspdf.com

Kinati 2 Pdf

http://www.k2pdf.com

DocMorph: Electronic Document Conversion

http://docmorph.nlm.nih.gov/docmorph


Now you should be able to quickly take any important web page you see on the internet and turn it into a pdf file and share it with others.

Related post:

MGTOW Torrent


Using eMule to share information
