Saturday, November 29, 2008

Week 13 readings

No Place to Hide:
What a dismal chapter!  You know, I do not believe, and have never believed, that we have to give away privacy or "freedom" in exchange for increased security.  The recent attacks in Mumbai just remind me that the US isn't the only target of terrorism.  Violence and terrorism are worldwide, and I do believe that the people promoting homeland security are not "trying to keep us safe" but are instead gleefully reaping profits by motivating us with fear.  I felt sick when I read the quote, "We want you to recognize the economic opportunity that homeland security presents. It is important for all Americans to remember that when the terrorists struck on September 11, 2001, one of their goals was to cripple the U.S. economy. We must remember this and change our mindset to make protecting the homeland a mission that moves our economy forward."
First of all: the use of the word homeland.  Only Native Americans can use this word correctly.  Only they are indigenous.  Next, people invoke 9/11 to make a profit.  This is just wrong.

TIA:  Only if these programs were run perfectly might they work as intended.  But they are human-created and human-run.  The humans who run them will make human decisions and human errors.  We will compile huge archives of everyday comings and goings, and while pursuing information in the name of security, many things which are not the business of the government or contractors, and which are still technically protected, will be noted and recorded.  In some grey area on some grey day, this information could be used against us, or to control us.  At the very worst, we are manipulated and don't know it.  At best, it is creepy to think that one may someday find out that all the dumb things one did on one random day have been recorded.

YouTube video: removed due to a copyright claim by Viacom.

Wednesday, November 12, 2008

Week 11 readings/muddiest point

Deep Web: Michael Bergman

This is an interesting article on the "deep web."  I'm still not sure how it gets such a fancy name.  I'm still not sure exactly what it is.  For example, they listed eBay as a deep-web site.  I can get to that via a search engine very easily.  On the web, it seems that the deep web will not look any different from the surface web.  In fact, all of the web was deep web until maybe the advent of search engines.
Maybe it is differentiated because search engines can reach only about 16% of the web.  It is a shame, because there is 400 to 550 times more public information on the deep web than on the surface web.  To conclude, I surmise that the deep web is not inaccessible, just not randomly searched.  People who use the deep web know where they're going and so don't google it.  Though I could be wrong.

Web Search Engines: part 1 /David Hawking

The premise here is that web search engines provide high-quality information quickly. They cannot and should not attempt to index the web in its entirety. Crawling begins with a "seed" URL.  The crawler can then follow the links it finds inside the seed (ex: topics within Wikipedia).
Different crawling machines cover different areas of the web, and forward requests to the machine that is assigned to them.  They also make sure that web servers are not overwhelmed with requests by adding a politeness delay, so that requests to the same server go in one at a time.
Robots do not re-crawl the whole web.
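
To check my understanding of seeds and politeness delays, here is a minimal sketch in Python. It is not how any real search engine is built: the "web" is a made-up dictionary of pages and links, the clock is simulated rather than real, and the names TOY_WEB, host_of and crawl are my own.

```python
from collections import deque
from urllib.parse import urlparse

# Toy stand-in for the web: each URL maps to the links found on that page.
# Every URL here is made up for illustration.
TOY_WEB = {
    "http://example.org/": ["http://example.org/a", "http://example.org/b"],
    "http://example.org/a": ["http://example.org/b", "http://example.net/"],
    "http://example.org/b": ["http://example.org/"],
    "http://example.net/": [],
}


def host_of(url):
    """Return the server (host) part of a URL."""
    return urlparse(url).netloc


def crawl(seed, web, politeness_delay=1.0):
    """Breadth-first crawl from a seed URL, using a simulated clock.

    Returns a list of (fetch_time, url) pairs. A host is never contacted
    again until politeness_delay time units after its previous fetch.
    """
    frontier = deque([seed])   # URLs waiting to be fetched
    seen = {seed}              # URLs already queued, so nothing is fetched twice
    next_allowed = {}          # host -> earliest time it may be contacted again
    clock = 0.0
    schedule = []

    while frontier:
        url = frontier.popleft()
        host = host_of(url)
        # Wait (on the simulated clock) until this host may be contacted.
        clock = max(clock, next_allowed.get(host, 0.0))
        schedule.append((clock, url))
        next_allowed[host] = clock + politeness_delay
        # "Fetch" the page and queue any links not seen before.
        for link in web.get(url, []):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return schedule


schedule = crawl("http://example.org/", TOY_WEB)
for fetch_time, url in schedule:
    print(fetch_time, url)
```

In this toy run the three example.org pages are fetched one time unit apart, while the single example.net page does not have to wait, because the delay applies per server.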


Web Search Engines: part 2 /David Hawking

The vocabulary of the web includes many languages, including new words specific to internet culture, and also includes misspellings and grammatical errors.
Most queries are about 2 words long. By default, results must include all of the query words.
Search engines have strategies to speed up searches: they can skip over parts of the lists they search, order those lists by decreasing value, and assign numerical scores that follow that order.  They can also cache: store in advance the answers to searches they anticipate.
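
Here is a small Python sketch of two of those ideas as I understand them: results must contain every query word, and answers to repeated queries are cached. The three documents are invented, the names DOCS, build_index and SearchEngine are my own, and real engines do far more (scoring, skipping, and so on).

```python
from collections import defaultdict

# Three made-up documents, keyed by document number.
DOCS = {
    1: "deep web search engines",
    2: "web search engines index the surface web",
    3: "digital libraries and metadata",
}


def build_index(docs):
    """Build an inverted index: word -> set of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


class SearchEngine:
    def __init__(self, docs):
        self.index = build_index(docs)
        self.cache = {}       # normalized query -> stored answer
        self.cache_hits = 0

    def search(self, query):
        """Return sorted ids of documents containing ALL query words."""
        # Normalize so "web search" and "search web" share a cache entry.
        words = tuple(sorted(set(query.lower().split())))
        if words in self.cache:
            self.cache_hits += 1
            return self.cache[words]
        if not words:
            result = []
        else:
            # Intersect the document sets, shortest first, and stop early
            # once nothing can match any more.
            postings = sorted((self.index.get(w, set()) for w in words), key=len)
            matches = set(postings[0])
            for docs_with_word in postings[1:]:
                matches &= docs_with_word
                if not matches:
                    break
            result = sorted(matches)
        self.cache[words] = result
        return result


engine = SearchEngine(DOCS)
print(engine.search("web search"))
print(engine.search("search web"))   # answered from the cache
print(engine.cache_hits)
```

Starting with the rarest word keeps the working set small, which is my simplified stand-in for the speed-up tricks described in the article.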

OAI Protocol for Metadata harvesting: 
I think this is about metadata and the steps taken to be able to search it comprehensively?  Seriously, I'm lost.
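
As far as I can tell, the core of it is that a harvester sends ordinary web requests to a repository and gets XML metadata records back. Each request names a "verb" such as Identify, ListRecords or GetRecord, and oai_dc is the standard prefix for plain Dublin Core metadata. The Python sketch below only builds such a request address and does not contact anything; the repository address and the harvest_url helper are made up for illustration.

```python
from urllib.parse import urlencode

# Hypothetical repository address, used only for illustration.
BASE_URL = "http://repository.example.org/oai"


def harvest_url(base_url, verb, **params):
    """Build an OAI-PMH request URL from a verb and optional arguments."""
    query = {"verb": verb}
    # Leave out any argument that was not supplied.
    query.update({k: v for k, v in params.items() if v is not None})
    return base_url + "?" + urlencode(query)


# Ask the repository for all of its records as simple Dublin Core.
url = harvest_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc")
print(url)
```

A harvester would fetch that address, read the XML that comes back, and add the metadata to its own searchable collection.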

Muddiest Point: Is there something I'm missing about the deep web?  Why is it not linked to search engines?
Could someone explain OAI to me in simple language?

Friday, November 7, 2008

Week 10 readings

Digital Libraries: Challenges and Influential Work by William Mischo

This is an interesting article explaining the beginnings of research into digital libraries, i.e. having some sort of methodology programmed into how one searches the web (putting the books back on the bookshelf, if you will).
It started with government grants to a few schools with different needs; other institutions then adopted what was done in these studies, and Google was born.
Now, the need is to build something akin to Google Scholar in digital library form.

Dewey Meets Turing: Librarians, Computer Scientists, and the DLI

Hmmm.  So, computer scientists and librarians could have gotten along rather well together, systematically organizing a digital library, but the pesky web was born and grew up so fast that it stressed the relationship.  Enter publishers, and computer scientists can no longer make their discoveries open and free, but can only tease their colleagues with their new programs.
Meanwhile, librarians' needs are not met, and librarians think that computer scientists don't understand them and have forgotten about them, and computer scientists just wish librarians were more like computer scientists!

Institutional Repositories

I agree with the author that it does not make sense for authors, especially those at academic institutions, to be in charge of posting their work online and archiving it.  This is not their job, and if they are forced to do it, the results will often be sub-par compared to what a centralized, specialized department could do with these works.
I also agree that they can be useful as collectors of ephemera.
But: they cannot claim ownership over faculty members' or students' work.
And cumbersome gate-keeping policies will be, in the words of the author, counterproductive. The author believes in keeping it simple.  Too much policy would undermine its effectiveness.
Also, as important as they are, let us not be too hasty in their implementation, but let us be thoughtful and end up with a usable, sane repository.