Focus | CNRS International Magazine

Coping with a Data Deluge

Today, the amount of information exchanged over the Internet represents five million times that contained in all the books ever written. How can this ever-expanding mass of information be analyzed? Marie-Christine Rousset, computer science professor and LIG1 member, is among those in the scientific community attempting to structure the continuous flow of data across the Web. "The pages we look at every day are part of the text-based Web, which contains billions of interconnected documents," she explains. "These pages cannot be used as a genuine knowledge base since they were designed to be read by humans, not machines." In other words, when a user enters a query in a search engine like Google, all the engine does is return a list of thousands of documents likely to match it. It is then up to the user to laboriously search for the most relevant response. Given the dizzying rate at which documents are published on the Web, this type of search model may soon prove inefficient for managing such large amounts of data. The alternative is to upgrade the existing Web to a data Web: "This approach is based on adding metadata to the URL addresses that identify Web pages. It aims to simplify the Web by organizing its information, thus granting end-users easier access to knowledge," explains Rousset. This development is already underway through the W3C, the international consortium that develops Web standards.

URL (Uniform Resource Locator): A string of characters for locating a Web page or website.
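The data Web that Rousset describes can be illustrated with a toy sketch. The snippet below (plain Python, with invented example URLs and properties) contrasts today's document search with a query over subject-predicate-object metadata triples of the kind standardized by the W3C's RDF: once facts are attached to page addresses as structured metadata, a machine can answer a question directly instead of handing the user a pile of documents to sift through.

```python
# A toy "data Web": each fact is a (subject, predicate, object) triple
# attached to a page URL, in the spirit of the W3C's RDF standard.
# The URLs and property names here are invented for illustration.
triples = [
    ("http://example.org/page/curie",   "profession", "physicist"),
    ("http://example.org/page/curie",   "bornIn",     "Warsaw"),
    ("http://example.org/page/rousset", "profession", "computer scientist"),
]

def query(predicate, obj):
    """Return the pages whose metadata matches the question,
    rather than a list of documents the user must read by hand."""
    return [s for (s, p, o) in triples if p == predicate and o == obj]

print(query("profession", "physicist"))
# A machine can answer directly: ['http://example.org/page/curie']
```

Real deployments express such triples in RDF syntaxes and query them with languages like SPARQL; the principle of machine-readable metadata is the same.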
VISUALIZATION TO IMPROVE UNDERSTANDING

The profusion of data now available to researchers is not always an advantage: the more information available, the harder it is to interpret. At the Bordeaux Computer Science Research Laboratory (LaBRI),1 David Auber and his team are trying to make this deluge of information more legible using analytical visualization methods. "Our approach is to apply mathematical tools like algorithms to sift through this raw data and extract the most pertinent information," explains the researcher. Using this method, stock prices, communications systems, chemical processes in cell metabolism, and geographical or social networks can be translated into visual metaphors. Such representations enable researchers to analyze the structure of these masses of information both quickly and efficiently. Although exponential growth in computing power has generated a considerable quantity of data over the past ten years, our brain's ability to process that information lags far behind. "Our short-term memory prevents us from analyzing more than seven things at once," Auber stresses. The principle of analytical visualization, via interfaces that help with data analysis, may soon become essential to bridge the gap.

01. Laboratoire Bordelais de Recherche en Informatique (CNRS / Université Bordeaux-I and -II / Université Bordeaux-Segalen / IPB).

A map of communications between 20,000 computers, developed at LaBRI.

Contact information:
David Auber
> david.auber@labri.fr
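The "sift through raw data and extract the most pertinent information" step that Auber describes can be hinted at with a minimal sketch. The snippet below (plain Python, with a handful of invented communication links standing in for a network like the 20,000-computer map) ranks machines by how many connections they carry, the kind of algorithmic reduction a visualization tool might apply before drawing anything, so that hubs can be emphasized and the rest faded.

```python
from collections import Counter

# Invented communication links between machines (who talked to whom);
# a real dataset would hold thousands of such pairs.
links = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("d", "e")]

# Count connections per machine: a simple algorithmic "sift" that
# extracts the most pertinent nodes before any visual layout is drawn.
degree = Counter()
for u, v in links:
    degree[u] += 1
    degree[v] += 1

# Keep only the busiest machines; a renderer would size or color
# nodes by this score instead of showing every link equally.
hubs = [machine for machine, _ in degree.most_common(2)]
print(hubs)
```

Production tools such as the graph-visualization software developed at LaBRI use far richer metrics and layouts, but the idea is the same: compute a summary of the raw data, then draw the summary.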