Database Management and Information Retrieval
It's a set of urgent problems: How can we best organize, store, access, and use the massive amounts of electronic data that people generate each day? Fortunately, CCIS has a dedicated, highly experienced team that is tackling these concerns from all angles and coming up with some of the industry's most inventive solutions.

Betty Salzberg, who is known for developing the holey brick tree (hB-tree), a spatial index, has created algorithms for grouping data with space and time attributes. She and Panfeng Zhou, PhD'06, conducted research on how to improve the hB-tree so that it handles empty space gracefully. In collaboration with David Lomet of Microsoft Corporation, Salzberg and graduate student Jing Shan also developed the patent-pending C-tree, which can be used to index spatial objects such as lakes, rivers, or road segments. Geographic systems can use such an index to determine which roads cross rivers or which river is closest to a given location, for instance, or to provide other geometric information. Donghui Zhang is also concerned with spatial indexing, but his research focuses primarily on advanced query processing using spatial index structures.

As a researcher interested in information retrieval, Javed Aslam considers similar problems from a different point of view. He studies efficient ways to extract information from databases. One of his current projects is an efficient technique for evaluating the quality of a search engine's results. Current techniques involve reading and assessing hundreds or thousands of retrieved Web pages and documents. With graduate students Virgil Pavlu and Emine Yilmaz, and a recent three-year NSF grant, Aslam is developing efficient techniques based on random sampling that can reduce this effort by a factor of 20 or more. "These techniques should make it much easier for search engines to assess, and ultimately improve, the quality of their results," says Aslam.
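
To give a feel for why random sampling can cut the assessment effort, here is a minimal sketch. It is not the technique Aslam's group developed; it only shows plain uniform sampling, and the function name `estimate_precision`, the stand-in `judge` function, and the 2,000-document data set are all hypothetical. Judging 100 sampled documents instead of all 2,000 is a 20-fold reduction in judgments, at the cost of some sampling error in the estimate.

```python
import random


def estimate_precision(results, judge, sample_size, seed=0):
    """Estimate the fraction of relevant results from a uniform random sample.

    Only the sampled documents are passed to `judge`, so the assessment
    effort is `sample_size` judgments instead of `len(results)`.
    """
    if not results:
        raise ValueError("results must be non-empty")
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sample = rng.sample(results, min(sample_size, len(results)))
    relevant = sum(1 for doc in sample if judge(doc))
    return relevant / len(sample)


# Hypothetical data: 2,000 retrieved document IDs, exactly 30% of them relevant.
results = list(range(2000))
judgments_made = 0


def judge(doc_id):
    """Stand-in for a human assessor; counts how many judgments are requested."""
    global judgments_made
    judgments_made += 1
    return doc_id % 10 < 3


estimate = estimate_precision(results, judge, sample_size=100)
print(f"estimated precision: {estimate:.2f} from {judgments_made} judgments")
```

The estimate will typically land near the true value of 0.30 but not exactly on it; a larger sample narrows the error at the price of more judgments.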
Robert Futrelle, another CCIS information-retrieval expert, is focused on the characterization and extraction of knowledge from biomedical literature. The goal is to allow users of all backgrounds to do content-based retrieval from such literature, extracting information on anything from molecular biology research to the results of clinical studies. "Figures such as data graphs and images occupy an important part of every research paper in biomedicine," says Futrelle, whose project is one of the few in the world that treats figure content as first-class material.

Kenneth Baclawski's research, in turn, focuses on ontology-based computing, including the Semantic Web, a layer above the World Wide Web that understands the meaning of information and can make valid inferences about it. He has been involved in the development of the Semantic Web since it started and is actively applying it in the health sciences in collaboration with Professor Tianhua Niu of the Harvard Medical School. "Single nucleotide polymorphisms (SNPs) are the most frequent DNA sequence variations in the human genome," Baclawski explains. "SNPs have a wide range of biomedical applications and because of the complexity of SNP data, ontology for SNP information is being developed using the Web Ontology Language."

Aslam, Baclawski, Futrelle, Salzberg, and Zhang often serve on the committees of top conferences, including the Association for Computing Machinery's