Information Services Platform Laboratory

Cross Database Search

To bring ever-growing Big Data to bear on the data-intensive science of the “fourth paradigm”, we propose an advanced cross-DB search engine that finds correlated, interdisciplinary datasets across large-scale, multi-domain, heterogeneous databases such as the World Data System (WDS). Most conventional portal systems provide only simple keyword-based search, which forces users to know the datasets in advance and to work out the relations among them on their own. In contrast, our technology searches for datasets that are correlated spatiotemporally, ontologically, or by citation. Examples of each, respectively, are: data from the area and time period surrounding a particular disaster; data on natural or social events known to be caused by that disaster; and data typically cited by documents that describe similar disasters.
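
As a rough illustration of the spatiotemporal case only, the following minimal Python sketch filters dataset records by distance from an event and by overlap with a time window around it. It is not the engine's actual implementation: the `Dataset` record, the function names, the radius and window parameters, and all sample values are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Dataset:
    """Hypothetical metadata record: a point location and a coverage period."""
    name: str
    lat: float
    lon: float
    start: datetime
    end: datetime


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def spatiotemporal_matches(datasets, lat, lon, when, radius_km, window):
    """Keep datasets within radius_km of the event whose coverage
    period overlaps [when - window, when + window]."""
    lo, hi = when - window, when + window
    return [
        d for d in datasets
        if haversine_km(lat, lon, d.lat, d.lon) <= radius_km
        and d.start <= hi and d.end >= lo
    ]


# Hypothetical catalogue entries (invented for this example).
catalogue = [
    Dataset("tide_gauge_A", 35.1, 139.1, datetime(2020, 1, 1), datetime(2020, 1, 31)),
    Dataset("seismograph_B", 43.0, 141.3, datetime(2020, 1, 1), datetime(2020, 1, 31)),
    Dataset("rainfall_C", 35.0, 139.0, datetime(2018, 1, 1), datetime(2018, 12, 31)),
]

# Hypothetical event: location and date are made up.
matches = spatiotemporal_matches(
    catalogue, lat=35.0, lon=139.0, when=datetime(2020, 1, 10),
    radius_km=100.0, window=timedelta(days=7),
)
print([d.name for d in matches])
```

Here only `tide_gauge_A` is both close enough and active during the window; `seismograph_B` is too far away and `rainfall_C` covers a different period. A real system would presumably work with regions and richer metadata rather than single points.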

An innovative complex join of these different types of correlations, based on evolutionary computing, is being developed. Data Citation (DC) allows textual documents to be linked to scientific data, which facilitates data reproduction and provenance tracking. The cross-DB search engine interacts with a data citation-mining module to find citational correlations. The current system is being developed to search the WDS, one of the largest science databases, geographically distributed across more than 100 sites worldwide; web archives on the scale of several billion pages; and physical and social sensing data.
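
To make the idea of a citational correlation concrete, the sketch below ranks datasets by how many documents on a given topic cite them. It assumes the citation-mining step has already produced, for each document, a set of topics and a list of cited datasets. That input format, the function name `citational_matches`, and the sample records are all hypothetical; the actual module's interface is not described here.

```python
from collections import Counter


def citational_matches(documents, topic, top_n=3):
    """Rank datasets by the number of documents about `topic` that cite them.

    Each document is a dict with a "topics" set and a "cited_datasets" list
    (a hypothetical format assumed for this example).
    """
    counts = Counter()
    for doc in documents:
        if topic in doc["topics"]:
            # Count each dataset at most once per document.
            counts.update(set(doc["cited_datasets"]))
    # Sort by descending count, then by name so that ties are deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Hypothetical output of a citation-mining step (invented for this example).
mined_documents = [
    {"title": "report_1", "topics": {"tsunami"},
     "cited_datasets": ["tide_gauge_A", "seismograph_B"]},
    {"title": "report_2", "topics": {"tsunami", "earthquake"},
     "cited_datasets": ["tide_gauge_A"]},
    {"title": "report_3", "topics": {"typhoon"},
     "cited_datasets": ["rainfall_C"]},
]

ranking = citational_matches(mined_documents, "tsunami")
print(ranking)
```

With this sample input, `tide_gauge_A` ranks first because two tsunami-related documents cite it, followed by `seismograph_B` with one citation; `rainfall_C` does not appear because its only citing document is about a different topic.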