The Effectiveness of Internet Content Filters

Report Number
746
Authors
P.B. Stark
Abstract

As part of its defense of the Child Online Protection Act, which seeks to prevent minors from viewing commercially published harmful-to-minors material on the World Wide Web, the U.S. Department of Justice commissioned a study of the prevalence of “adult” materials and the effectiveness of Internet content filters in blocking them. As of 2005–2006, about 1.1% of webpages indexed by Google and MSN were adult, which amounts to hundreds of millions of pages. About 6% of a set of 1.3 billion searches executed on AOL, MSN and Yahoo! in summer 2005 retrieved at least one adult webpage among the first ten results, and about 1.7% of the first ten results were adult webpages. These estimates are based on simple random samples of webpages indexed by search engines and a stratified random sample of searches. Webpages with sexually explicit content intended for adult entertainment (i.e., not in an educational, medical or artistic context) were used to test a variety of Internet content filters for underblocking, that is, failing to block webpages that they are intended to block. A random sample of “clean” webpages with no sexual content or reference to sex was used to test the filters for overblocking, that is, blocking webpages they are not intended to block. Webpages retrieved by the most popular searches according to Wordtracker were also categorized and used to test filters. Generally, filters with lower rates of underblocking had higher rates of overblocking. If the filter most effective at blocking adult materials were applied to search indexes, typical query results or the results of popular queries, the number of clean pages blocked in error would exceed the number of adult pages blocked correctly.
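
The final claim of the abstract is a base-rate effect: when adult pages are a small share of all pages, even a modest overblocking rate applied to the much larger pool of clean pages can outweigh the adult pages a filter catches. The following is a minimal sketch of that arithmetic, not the report's own method. The 1.1% prevalence comes from the abstract; the function names and the underblocking and overblocking rates in the usage lines are hypothetical values chosen only for illustration, since the abstract does not state the rates measured for any filter.

```python
def breakeven_overblock_rate(prevalence: float, underblock_rate: float) -> float:
    """Overblocking rate at which clean pages blocked in error equal
    adult pages blocked correctly.

    prevalence      -- fraction of all pages that are adult (0 < p < 1)
    underblock_rate -- fraction of adult pages the filter fails to block
    """
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be strictly between 0 and 1")
    if not 0.0 <= underblock_rate <= 1.0:
        raise ValueError("underblock_rate must be between 0 and 1")
    adult_blocked_share = prevalence * (1.0 - underblock_rate)
    clean_share = 1.0 - prevalence
    return adult_blocked_share / clean_share


def blocked_counts(total_pages: float, prevalence: float,
                   underblock_rate: float, overblock_rate: float):
    """Return (adult pages blocked correctly, clean pages blocked in error)."""
    adult_pages = total_pages * prevalence
    clean_pages = total_pages - adult_pages
    correctly_blocked = adult_pages * (1.0 - underblock_rate)
    wrongly_blocked = clean_pages * overblock_rate
    return correctly_blocked, wrongly_blocked


if __name__ == "__main__":
    prevalence = 0.011       # from the abstract: about 1.1% of indexed pages
    underblock_rate = 0.05   # hypothetical, for illustration only
    overblock_rate = 0.02    # hypothetical, for illustration only

    threshold = breakeven_overblock_rate(prevalence, underblock_rate)
    correct, wrong = blocked_counts(1_000_000, prevalence,
                                    underblock_rate, overblock_rate)
    print(f"break-even overblocking rate: {threshold:.4%}")
    print(f"adult pages blocked correctly: {correct:,.0f}")
    print(f"clean pages blocked in error:  {wrong:,.0f}")
```

Under these assumptions, even a filter that blocked every adult page would block more clean pages than adult pages once its overblocking rate exceeded 0.011 / 0.989, roughly 1.1%. The same calculation applies to search results by substituting the 1.7% share of adult pages among first ten results.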
