Saturday, November 17, 2007

Online Project to Track Terrorist Websites

Tens of thousands of web pages are now devoted to terrorist propaganda designed to attract followers. On the surface, the messages and videos reveal little about their creators.

But programmers and writers leave digital clues: the greetings and other words they choose, their punctuation and syntax, and the way they code multimedia attachments and web links.

Researchers at the University of Arizona are developing a tool that uses these clues to automate the analysis of 'online jihadism'. The Dark Web project aims to scour websites, forums and chat rooms to find the Internet's most prolific and influential jihadists and learn how they reel in adherents.

Lab director Hsinchun Chen hopes Dark Web will crimp what he calls "al-Qaida Universities on the Web", the mass of websites where potential terrorists learn their trade, from making explosives to planning attacks.

Experts said they are not aware of any comparable effort, though some said the project may have only limited applications.

The project in the university's Artificial Intelligence Lab will not identify people outside cyberspace "because that involves civil liberties," Chen said, preferring to let law enforcement and intelligence analysts take over from there.

Instead, it will help identify messages with the same author and reveal links that aren't obvious.

The bulk of a $1.3 million grant that the National Science Foundation gave Chen's group will go toward studying who produces improvised explosives and what they talk about, such as American troop movements and terrorist tactics. Before receiving the NSF funding, Chen started the project with $3 million from other Artificial Intelligence Lab programs.

Dark Web's software, which Chen calls Writeprint, samples 480 different factors to identify whether the same people are posting to multiple radical forums. It can analyze everything from a fragment of an email to videos depicting American soldiers blown up in Humvees and fuel tankers.

Writeprint is derived from a program originally used to determine the authenticity of William Shakespeare's works. It looks at writing style, word usage and frequency, and greetings, as well as technical elements ranging from web addresses to the coding of multimedia attachments. It also looks at linguistic and formatting features such as special characters, punctuation, word roots, font size and color.
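To make the idea of a stylistic "fingerprint" concrete, here is a minimal sketch in Python of how a few such clues could be turned into numbers. It is not Writeprint's actual method: the function name `style_features`, the short word list and the handful of measurements are assumptions chosen for illustration, and stand in for the 480 factors the article mentions.

```python
import re
from collections import Counter

# Hypothetical, tiny feature set chosen for illustration only.
FUNCTION_WORDS = ("the", "and", "of", "to", "in")
PUNCTUATION = ".,!?;:"


def style_features(text):
    """Return a small dictionary of style measurements for one message."""
    words = re.findall(r"[a-z']+", text.lower())
    total_words = max(len(words), 1)  # avoid division by zero on empty input
    total_chars = max(len(text), 1)
    counts = Counter(words)

    features = {
        # Average length of the words the author chooses.
        "avg_word_len": sum(len(w) for w in words) / total_words,
        # The opening word, a crude stand-in for the greeting.
        "first_word": words[0] if words else "",
    }
    # Relative frequency of a few common function words.
    for word in FUNCTION_WORDS:
        features["fw_" + word] = counts[word] / total_words
    # How often each punctuation mark appears per character.
    for mark in PUNCTUATION:
        features["punct_" + mark] = text.count(mark) / total_chars
    return features
```

Calling `style_features` on two messages gives two small profiles that can be compared side by side; a real system would measure far more, including the markup and multimedia coding described above.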

Currently, intelligence analysts cannot effectively analyze writing style in cyberspace, particularly multilingual writings, Chen said.

Chen and counterterrorism specialists said the growth of jihadist content online, which Chen described as a tenfold increase over the last two years, has outstripped intelligence analysts' abilities.

Dark Web compares the writings it finds with others in its logs of about 500 million pages of jihadist-produced documents, videos, images, emails and other postings, Chen said.
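The kind of comparison described here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's software: it swaps in a common stylometry technique (character three-letter sequences scored with cosine similarity), and the names `trigram_profile`, `cosine` and `closest_author` are hypothetical.

```python
import math
from collections import Counter


def trigram_profile(text):
    """Count every overlapping three-character sequence in the text."""
    lowered = text.lower()
    return Counter(lowered[i:i + 3] for i in range(len(lowered) - 2))


def cosine(a, b):
    """Cosine similarity between two count profiles (0.0 if either is empty)."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def closest_author(unknown_text, known_texts):
    """Return the label in known_texts whose writing is most similar.

    known_texts maps a label (assumed here to be an author handle)
    to a sample of that author's writing; it must not be empty.
    """
    profile = trigram_profile(unknown_text)
    return max(
        known_texts,
        key=lambda name: cosine(profile, trigram_profile(known_texts[name])),
    )
```

Given a dictionary of writing samples keyed by forum handle, `closest_author(new_post, samples)` returns the handle whose sample scores highest. A score like this only suggests that two postings share an author; it does not identify anyone outside cyberspace.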

Most of the material is in Arabic, but as terrorist sympathizers have spawned new sites worldwide since 2005, Dark Web has expanded to look at Chinese-, Spanish- and French-language postings, and others will be added.