• Development Of An Information Retrieval System Using Tree-structured Clustering
    [FRSC Benue State]

  • CHAPTER ONE -- [Total Page(s) 2]

    • CHAPTER ONE INTRODUCTION
      1.1    Background of the study
      An Information Retrieval System is a system capable of storing, retrieving and maintaining information. Its general objective is to minimize the overhead a user incurs in locating needed information, where overhead can be expressed as the time the user spends in all of the steps leading up to reading an item that contains the needed information. The two major measures commonly associated with information systems are precision and recall. Information Retrieval (IR) is a large and growing field within Natural Language Processing (Magnus, 2006). In disk storage, a cluster, or allocation unit as it was formerly called, is the smallest logical amount of disk space that can be allocated to hold a file or directory. In data analysis the word carries a different sense: cluster analysis, or clustering, is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups. It is a main task of exploratory data mining and a common technique for statistical data analysis, used in many fields including machine learning, pattern recognition, image analysis, information retrieval and bioinformatics (Linus, 2014). Cluster analysis itself is not one specific algorithm but the general task to be solved.
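      As an illustration of the two measures named above: precision is the fraction of retrieved items that are relevant, and recall is the fraction of relevant items that were retrieved. The sketch below is a minimal example in Python; the function name and the document identifiers are made up for illustration and are not part of the system described in this study.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for the result list of one query."""
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)  # relevant items that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: four items returned, three items actually relevant.
retrieved_ids = ["d1", "d2", "d3", "d4"]
relevant_ids = ["d2", "d4", "d7"]
p, r = precision_recall(retrieved_ids, relevant_ids)
print(f"precision={p:.2f} recall={r:.2f}")
```

      Here two of the four retrieved items are relevant, and two of the three relevant items were found.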
      Cluster analysis is a technique that assigns items to automatically created groups based on a calculation of the degree of association between items and groups. In the information retrieval (IR) field, cluster analysis has been used to create groups of documents with the goal of improving the efficiency and effectiveness of retrieval, or to determine the structure of the literature of a field. The terms in a document collection can also be clustered to show their relationships. The two main types of cluster analysis methods are the nonhierarchical, which divide a data set of N items into M clusters, and the hierarchical, which produce a nested data set in which pairs of items or clusters are successively linked. The nonhierarchical methods such as the single pass and reallocation methods are heuristic in nature and require less computation than the hierarchical methods. Clustered files are often suggested as a way to cut down search time in similarity-based systems (Caroline and Stephen). In such an organization, similar documents are grouped together in clusters, and only the most promising clusters are examined.
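      The single pass method mentioned above can be sketched as follows. This is only an illustrative sketch under stated assumptions: documents are represented as sets of terms, similarity is measured with the Jaccard coefficient, a cluster is represented by the union of its members' terms, and the threshold of 0.3 is an arbitrary choice. None of these details come from the sources cited here.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of terms."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def single_pass(docs, threshold=0.3):
    """Cluster documents in one pass over the collection.

    Each document joins the most similar existing cluster if the
    similarity reaches the threshold; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster: {"terms": set, "members": list}
    for doc_id, terms in docs.items():
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = jaccard(terms, cluster["terms"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(doc_id)
            best["terms"] |= terms
        else:
            clusters.append({"terms": set(terms), "members": [doc_id]})
    return clusters


# Hypothetical records, reduced to a few index terms each.
docs = {
    "d1": {"vehicle", "registration", "owner"},
    "d2": {"vehicle", "registration", "plate"},
    "d3": {"driving", "licence", "renewal"},
    "d4": {"driving", "licence", "test"},
}
groups = [c["members"] for c in single_pass(docs)]
print(groups)
```

      Because each document is compared only with the clusters formed so far, the result depends on the order in which documents arrive, which is one reason such methods are described as heuristic.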
      The cluster hypothesis states the fundamental assumption we make when using clustering in information retrieval.
      Cluster hypothesis. “Documents in the same cluster behave similarly with respect to relevance to information needs.” The hypothesis states that if a document in a cluster is relevant to a search request, then other documents in the same cluster are likely to be relevant as well (Linus, 2014). This is because clustering puts together documents that share many terms. In other words, we posit that similar documents behave similarly with respect to relevance.
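      One way the cluster hypothesis is put to use is cluster-based search: the query is first matched against clusters, and only documents in the most promising clusters are scored. The sketch below assumes documents and clusters are represented as sets of terms and uses a simple term-overlap count as the score; the function names, the sample data and the scoring rule are all illustrative assumptions.

```python
def overlap(query, terms):
    """Number of query terms that also occur in `terms`."""
    return len(query & terms)


def cluster_search(query, clusters, docs, top_clusters=1):
    """Score only the documents in the clusters that best match the query."""
    ranked = sorted(clusters, key=lambda c: overlap(query, c["terms"]), reverse=True)
    candidates = [d for c in ranked[:top_clusters] for d in c["members"]]
    scored = [(overlap(query, docs[d]), d) for d in candidates]
    return [d for score, d in sorted(scored, reverse=True) if score > 0]


# Hypothetical records and a clustering of them.
docs = {
    "d1": {"vehicle", "registration", "owner"},
    "d2": {"vehicle", "registration", "plate"},
    "d3": {"driving", "licence", "renewal"},
    "d4": {"driving", "licence", "test"},
}
clusters = [
    {"members": ["d1", "d2"], "terms": docs["d1"] | docs["d2"]},
    {"members": ["d3", "d4"], "terms": docs["d3"] | docs["d4"]},
]
results = cluster_search({"vehicle", "owner"}, clusters, docs)
print(results)
```

      Only the first cluster is examined for this query, so d3 and d4 are never scored. That saves work, at the risk of missing a relevant document that was placed in a different cluster.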
      Tree clustering is a form of clustering algorithm that joins objects together successively into clusters, using some measure of similarity or distance. A typical example of this kind of clustering is the hierarchical tree. Hierarchical clustering is based on the core idea that objects are more related to nearby objects than to objects farther away. As such, these algorithms connect objects to form clusters based on their distances.
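      The successive joining described above can be sketched as a small bottom-up (agglomerative) procedure. The example assumes one-dimensional numeric objects and single-link distance (the distance between the closest members of two clusters); both are simplifications chosen for illustration, and the function names are hypothetical.

```python
from itertools import combinations


def single_link(a, b, points):
    """Distance between two clusters: the closest pair of members."""
    return min(abs(points[i] - points[j]) for i in a for j in b)


def agglomerate(points):
    """Merge the two closest clusters repeatedly and return the merge tree.

    Leaves are point indices; an internal node is a (left, right) tuple.
    """
    # Each entry pairs a tree node with the point indices beneath it.
    clusters = [(i, [i]) for i in range(len(points))]
    while len(clusters) > 1:
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: single_link(clusters[p[0]][1], clusters[p[1]][1], points),
        )
        merged = ((clusters[x][0], clusters[y][0]), clusters[x][1] + clusters[y][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [merged]
    return clusters[0][0]


# Five hypothetical objects described by a single numeric feature.
values = [1.0, 1.5, 8.0, 8.2, 20.0]
tree = agglomerate(values)
print(tree)
```

      The nested tuples are the hierarchical tree: objects 2 and 3 are joined first, then 0 and 1, then those two groups, and the distant object 4 is attached last.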
      1.2    Statement of the Problem
      Although Benue State is still a developing state, the number of vehicle owners in the state continues to grow, which means more work for the Federal Road Safety Corps (FRSC) in Benue State. Clear statistics on vehicle owners, both in each local government area and in the state as a whole, are needed to combat fake vehicle registration, false driving licences, vehicle theft and similar problems, and to ensure that road users keep road rules and regulations through proper registration and monitoring. Achieving this requires an information system that offers advanced storage, processing and easy retrieval. This study is therefore necessary: it will improve the vehicle registration process and give the FRSC quick and easy access to information on registered vehicles and their owners anywhere and at any time in the state. The system will, to a large extent, reduce the challenges and restrictions associated with the manual process of registering vehicles.
