• A System For Health Document Classification Using Machine Learning

  • CHAPTER ONE

    • 1.4    SCOPE OF THE STUDY
      As stated earlier, documents can be classified using statistical pattern recognition or neural networks. This project concentrates on using a machine learning algorithm to classify documents.
      1.5    SIGNIFICANCE OF THE STUDY
      The software delivered by this project will greatly reduce the time that doctors, physicians and other health workers spend searching for and retrieving documents.
      Other benefits of this project include:
      1.    It will help students and other interested individuals who want to develop a similar application.
      2.    It will serve as a source of material for those investigating the processes involved in developing a document classification system using machine learning.
      3.    It will serve as a source of material for students who are interested in studying machine learning.
      1.6    DEFINITION OF TERMS
      Document Classification: the task of grouping documents into categories based on their content.
      Health Document: a document that records health-related information. One example is a health certificate, which is written by a doctor and states the official results of a physical examination.
      Machine Learning: the study and construction of algorithms that can learn from data and make predictions on it.
      JSP: JavaServer Pages, a Java technology for creating dynamic web pages.
      HTML: HyperText Markup Language, the markup language used to create web pages.
      MySQL: a database management system for creating, storing and manipulating databases.
      Servlet: a small pluggable extension to a server that enhances the server's functionality.
      Bootstrap: a sleek, intuitive and powerful mobile-first front-end framework for faster and easier web development. It uses HTML, CSS and JavaScript.
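      To make the definitions of document classification and machine learning more concrete, the following is a minimal sketch of a classifier that learns word frequencies from labelled sentences and then predicts a category for unseen text. It assumes a multinomial Naive Bayes model with add-one smoothing, which is a swapped-in technique chosen for brevity and is not claimed to be the algorithm used in this project. The class name NaiveBayesSketch, its method names and the training sentences are all hypothetical and serve only as illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration class; not part of the project's code base.
class NaiveBayesSketch {
    // category -> (word -> number of occurrences in that category)
    private final Map<String, Map<String, Integer>> wordCounts = new HashMap<>();
    // category -> total number of words seen in that category
    private final Map<String, Integer> totalWords = new HashMap<>();
    // category -> number of training documents
    private final Map<String, Integer> docCounts = new HashMap<>();
    // every distinct word seen during training
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    // Lower-case the text and split it on anything that is not a letter or digit.
    private static String[] tokenize(String text) {
        return text.toLowerCase().split("[^a-z0-9]+");
    }

    // Learn from one labelled document.
    void train(String category, String text) {
        docCounts.merge(category, 1, Integer::sum);
        totalDocs++;
        Map<String, Integer> counts =
                wordCounts.computeIfAbsent(category, c -> new HashMap<>());
        for (String token : tokenize(text)) {
            if (token.isEmpty()) {
                continue;
            }
            counts.merge(token, 1, Integer::sum);
            totalWords.merge(category, 1, Integer::sum);
            vocabulary.add(token);
        }
    }

    // Return the most probable category, or null if nothing was trained.
    String classify(String text) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String category : docCounts.keySet()) {
            // Log prior: share of training documents in this category.
            double score = Math.log(docCounts.get(category) / (double) totalDocs);
            Map<String, Integer> counts = wordCounts.get(category);
            int total = totalWords.getOrDefault(category, 0);
            for (String token : tokenize(text)) {
                if (token.isEmpty()) {
                    continue;
                }
                // Add-one smoothing so unseen words do not zero out the score.
                int count = counts.getOrDefault(token, 0);
                score += Math.log((count + 1.0) / (total + vocabulary.size()));
            }
            if (score > bestScore) {
                bestScore = score;
                best = category;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        NaiveBayesSketch nb = new NaiveBayesSketch();
        // Illustrative training sentences, not real project data.
        nb.train("malaria", "Malaria is typically spread by mosquitoes");
        nb.train("malaria", "Malaria is a mosquito-borne infectious disease");
        nb.train("diabetes", "Diabetes affects how the body regulates blood sugar");
        nb.train("diabetes", "Insulin helps control blood sugar in diabetes");
        System.out.println(nb.classify("fever after a mosquito bite")); // prints "malaria"
    }
}
```

      The classifier is never given rules about malaria or diabetes; it only counts words in the labelled examples, which is the sense in which it learns from data and makes predictions on it. A production system would need far more training data and a proper evaluation step.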
      1.7    ORGANIZATION OF WORK
      Chapter one introduces the background of the project and sets out the statement of the problem, the objectives of the project, its significance, scope and constraints.
      Chapter two reviews the literature on machine learning and document classification, together with related work.
      Chapter three discusses system investigation and analysis. It covers a detailed investigation and analysis of the existing system, identifies its problems and proposes the new system.
      Chapter four covers the system design and implementation. Chapter five presents the summary and conclusion of the project.
