U Not Gonna Like This: how


Reviving Dead Web Pages

2/09/2008 by mezzrowjr how_tos , information , internet , link_bait

Learning how to retrieve web sites that no longer exist from a search engine's cache may turn out to be one of the most important skills you learn. Being able to revive a web page that is still cached by a search engine can help you bridge potentially important gaps in information and research.

One way to do this, and the easiest by far, is to use the Wayback Machine at the Internet Archive (link).
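As a rough illustration of how a Wayback Machine lookup address is put together, here is a small Python sketch. It assumes the commonly seen address pattern of archive prefix, then a date stamp (or `*` for the list of all captures), then the original URL. The helper name `wayback_url` and the example.com address are made up for this example and are not part of any official tool.

```python
from datetime import date
from typing import Optional

# Assumed address prefix for archived pages (not taken from the post itself).
WAYBACK_PREFIX = "http://web.archive.org/web"


def wayback_url(page_url: str, when: Optional[date] = None) -> str:
    """Build a Wayback Machine address for page_url.

    With no date, the "*" wildcard asks for the list of all captures.
    With a date, a YYYYMMDD stamp asks for a capture near that day.
    """
    stamp = when.strftime("%Y%m%d") if when else "*"
    return f"{WAYBACK_PREFIX}/{stamp}/{page_url}"


# List every capture of a (hypothetical) dead page.
print(wayback_url("http://www.example.com/"))
# Ask for the capture nearest to 9 February 2008.
print(wayback_url("http://www.example.com/", date(2008, 2, 9)))
```

Pasting either printed address into a browser is the same as typing the dead URL into the Wayback Machine's search box.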

However, the Wayback Machine *does not* have a copy of every web page ever uploaded to the web. Some site owners specifically request that their sites not be crawled for inclusion in the archive, while other webmasters include a few lines of code that block the crawlers (better known as robots). So once such a "blocked" page disappears, it's gone forever.
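To see what those "small lines of code" look like in practice, here is a hedged Python sketch that reads a sample robots.txt and checks whether a given crawler is shut out. The robots.txt text, the crawler name `ia_archiver` (a name historically associated with archive crawling, used here as an assumption), and the helper `is_blocked` are all illustrative. Only the standard library's `urllib.robotparser` is used.

```python
from urllib.robotparser import RobotFileParser

# A sample robots.txt that turns away one crawler and nobody else.
# The crawler name is an assumption chosen for illustration.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def is_blocked(robots_txt: str, crawler: str, page_url: str) -> bool:
    """Return True if robots_txt forbids crawler from fetching page_url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(crawler, page_url)


page = "http://www.example.com/page.html"
print(is_blocked(ROBOTS_TXT, "ia_archiver", page))   # the named crawler
print(is_blocked(ROBOTS_TXT, "SomeOtherBot", page))  # any other crawler
```

If a site you care about blocks the archive's robot this way, that is a good reason to save your own copy while the page is still up.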

But you still might be able to find other pages that are just as important, and here's how you do that:

From the Google help center (link):

Advanced Operators

Google supports several advanced operators, which are query words that have special meaning to Google. Typically these operators modify the search in some way, or even tell Google to do a totally different type of search. For instance, "link:" is a special operator, and the query [link:www.google.com] doesn't do a normal search but instead finds all web pages that have links to www.google.com.

Several of the more common operators use punctuation instead of words, or do not require a colon. Among these operators are OR, "" (the quote operator), - (the minus operator), and + (the plus operator). More information on these types of operators is available on the Basics of Search page. Many of these special operators are accessible from the Advanced Search page, but some are not. Below is a list of all the special operators Google supports.

and the cache search operator:

The query [cache:] will show the version of the web page that Google has in its cache. For instance, [cache:www.google.com] will show Google's cache of the Google homepage. Note there can be no space between the "cache:" and the web page URL.

If you include other words in the query, Google will highlight those words within the cached document. For instance, [cache:www.google.com web] will show the cached content with the word "web" highlighted.

This functionality is also accessible by clicking on the "Cached" link on Google's main results page.
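The rules quoted above (no space after "cache:", optional extra words to highlight) can be sketched in a few lines of Python. The helper names `cache_query` and `search_url` are invented for this example, and the `/search?q=` address form is an assumption rather than something the help text states. The sketch only builds the text you would type or paste; it does not contact Google.

```python
from urllib.parse import urlencode


def cache_query(page_url: str, *highlight: str) -> str:
    """Build a cache: query, keeping "cache:" glued to the URL."""
    # Strip stray whitespace so no space sneaks in after "cache:".
    query = "cache:" + page_url.strip()
    if highlight:
        # Extra words are the ones to be highlighted in the cached copy.
        query += " " + " ".join(highlight)
    return query


def search_url(query: str) -> str:
    """Wrap a query in an assumed search address, URL-encoding it."""
    return "http://www.google.com/search?" + urlencode({"q": query})


print(cache_query("www.google.com", "web"))
print(search_url(cache_query("www.google.com")))
```

The first printed line is exactly the [cache:www.google.com web] example from the help text, ready to paste into the search box.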

So now you can look for deleted web pages on your own and fill in those research gaps.

Sites that contain more than one copy of a web page:

Archive-It
http://www.archive-it.org/public/all_collections

Wayback Machine
http://www.archive.org/web/web.php

WebCite (Mostly health related pages)

http://www.webcitation.org/query

Other single copy archives include:

Gigablast (goes back 1 year from the current calendar date)

http://www.gigablast.com/

Exalead (goes back 6 months from the current date)

http://www.exalead.com/

Family-source (Goes back to 2005)

http://www.family-source.com

And the sites below contain further technical explanation, tools, and/or guides on how to do cache searching:

http://www.searchengineshowdown.com/others/archive.shtml

http://www.robotstxt.org/faq.html

http://www.onlinemag.net/mar02/OnTheNet.htm

http://web.archive.org/collections/web/advanced.html

http://www.google.com/help/features.html

http://www.google.com/help/operators.html

http://www.googleguide.com/cached_pages.html

http://en.wikipedia.org/wiki/Page_cache

http://help.yahoo.com/l/us/yahoo/search/basics/basics-09.html

http://www.pagefactor.com/

http://urlcut.com/ManyFacesGoogle

http://blog.searchenginewatch.com/blog/060118-165021

http://www.webuildpages.com/cache/cachetoolpublic.pl

http://www.web-caching.com/

http://www.web-cache.com/

http://squidbook.org/index-two.html

http://www.caching.com/

http://www.ircache.net/

http://pages.cs.wisc.edu/~cao/links.html

http://www.w3.org/Propagation/

http://www.mnot.net/cache_docs/

http://vancouver-webpages.com/CacheNow/

http://forskningsnett.uninett.no/arkiv/desire/

http://www-sor.inria.fr/projects/relais/reading-list.html

http://excalibur.usc.edu/

http://www.research.rutgers.edu/~davison/web-caching/bibliography.html

http://www.networkworld.com/netresources/caching.html

http://www.w3.org/Daemon/

Related Post:

Making Pdf Files

Using Emule to get and share information


Making Pdf Files

1/20/2008 by mezzrowjr documents , how_tos , information , link_bait , technology

With web sites constantly changing and disappearing, I have come to realize how important it is to capture information the instant you see it and then archive it or otherwise record it.

So, this post covers software used for creating PDF files.

There are several ways you can make pdf files from web pages.

The first and easiest way is to have a copy of Adobe Acrobat installed on your computer.

Adobe Acrobat (the current version is number 8)

http://www.adobe.com/products/acrobatpro/acrobatstd.html

Another way to make PDFs is to download software that installs a PDF print driver on your computer; "printing" a page to that driver then saves it as a PDF file.

First you'll need the Ghostscript software:

What is Ghostscript?

Ghostscript is the name of a set of software that provides:

* An interpreter for the PostScript (TM) language, with the ability to convert PostScript language files to many raster formats, view them on displays, and print them on printers that don't have PostScript language capability built in;

* An interpreter for Portable Document Format (PDF) files, with the same abilities;

* The ability to convert PostScript language files to PDF (with some limitations) and vice versa; and

* A set of C procedures (the Ghostscript library) that implement the graphics capabilities that appear as primitive operations in the PostScript language.
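For the PostScript-to-PDF conversion mentioned in the list above, a minimal Python sketch of a Ghostscript call might look like this. It assumes the Ghostscript program is installed and reachable as `gs` (on Windows the executable usually has a different name, so it is a parameter here). The function names and the file names `page.ps` and `page.pdf` are placeholders for illustration.

```python
import subprocess
from typing import List


def ps_to_pdf_command(ps_path: str, pdf_path: str, gs_exe: str = "gs") -> List[str]:
    """Build the Ghostscript command line for a PostScript to PDF conversion."""
    return [
        gs_exe,                       # assumed executable name, adjust per system
        "-dBATCH",                    # exit when the input has been processed
        "-dNOPAUSE",                  # do not wait for a keypress between pages
        "-sDEVICE=pdfwrite",          # use the PDF-writing output device
        f"-sOutputFile={pdf_path}",   # where the PDF should be written
        ps_path,                      # the PostScript file to read
    ]


def convert(ps_path: str, pdf_path: str, gs_exe: str = "gs") -> None:
    """Run the conversion; raises CalledProcessError if Ghostscript fails."""
    subprocess.run(ps_to_pdf_command(ps_path, pdf_path, gs_exe), check=True)


# Show the command without running it (file names are placeholders).
print(" ".join(ps_to_pdf_command("page.ps", "page.pdf")))
```

Calling `convert("page.ps", "page.pdf")` would actually run Ghostscript. The free PDF print drivers listed further down do much the same thing behind a printer dialog.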

You can download the Ghostscript software from these pages:

http://pages.cs.wisc.edu/~ghost/doc/AFPL/index.htm

Ghostscript and Ghostview:

http://pages.cs.wisc.edu/~ghost

GNU Ghostscript:

http://www.gnu.org/software/ghostscript/ghostscript.html

PostScript resources:

http://www.geocities.com/SiliconValley/5682/postscript.html

Here's a short list of freeware PDF print drivers:

Bull Zip

http://www.bullzip.com/products/pdf/info.php

Pdf Creator

http://www.pdfforge.org/products/pdfcreator

Conv2Pdf

http://www.conv2pdf.com/

Cute Pdf

http://www.cutepdf.com/Products/CutePDF/writer.asp

Pdf 995

http://www.pdf995.com

Primo Pdf

http://www.primopdf.com

Very pdf Doc 2 Pdf

http://www.verypdf.com/pdfcamp/doc2pdf_readme.html

doPDF Free PDF Converter

http://www.dopdf.com

Pdf 4 Free

http://www.pdfpdf.com/pdf4free.html

Tomahawk PDF+ 2.9.5.0

http://www.nativewinds.montana.com/software/tpdfplus.html

PDF-o-matic

http://www.easysw.com/htmldoc/pdf-o-matic.php

Pdf 4 U

http://www.pdf4u.com

Express Pdf

http://www.expresspdf.com

Kinati 2 Pdf

http://www.k2pdf.com

DocMorph: Electronic Document Conversion

http://docmorph.nlm.nih.gov/docmorph

Now you should be able to quickly take any important web page you see on the internet and turn it into a pdf file and share it with others.

Related post:

MGTOW Torrent

Using eMule to share information

