Renu
Department of Computer Science, Shri Ram College of Engineering & Management, Palwal
Neha
Department of Computer Science, Shri Ram College of Engineering & Management, Palwal
Kunal
Department of Computer Science, Shri Ram College of Engineering & Management, Palwal
Text summarization is the task of condensing a source text into a shorter version while preserving its information content and overall meaning. It is very difficult for human beings to summarize large documents manually. Text summarization methods can be classified into extractive and abstractive summarization. An extractive summarization method selects important sentences, paragraphs, etc. from the original document and concatenates them into a shorter form. The importance of each sentence is decided based on its statistical and linguistic features. An abstractive summarization method consists of understanding the original text and retelling it in fewer words. It uses linguistic methods to examine and interpret the text, and then finds new concepts and expressions to describe it, generating a new, shorter text that conveys the most important information from the original document. Usually, the flow of information in a given document is not uniform, which means that some parts are more important than others. The major challenge in summarization lies in distinguishing the more informative parts of a document from the less informative ones. Though there have been instances of research describing the automatic creation of abstracts, most work presented in the literature relies on verbatim extraction of sentences to address the problem of single-document summarization. In this work, we describe some prominent extractive techniques. First, we look at early research on summarization. Second, we concentrate on approaches involving machine learning techniques. In this dissertation, ontology-based document summarization is proposed, which provides a more efficient and accurate summary than other approaches. The main motivation for summarization is to determine, from a summary of a large document, whether its content is useful to the reader, for example whether a product is worth purchasing.
A large volume of text makes it difficult for a potential customer to read all of it and make an informed decision on whether to purchase the product. It also makes it difficult for the manufacturer of the product to keep track of and manage customer opinion. In the proposed scheme, we present an enhanced algorithm using a latent semantic kernel for better results.
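The extractive approach described above, scoring sentences by statistical features and concatenating the highest-scoring ones, can be illustrated with a minimal Python sketch. It is not the ontology-based or latent semantic kernel method proposed here. It assumes a simple TF-IDF score in which each sentence is treated as its own "document", and the names `split_sentences`, `tokenize`, and `summarize` are hypothetical helpers introduced only for this illustration.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (an assumption): break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercase alphanumeric tokens; no stemming or stop-word removal.
    return re.findall(r"[a-z0-9]+", sentence.lower())


def summarize(text, k=2):
    """Return the k highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)

    # Sentence frequency: the number of sentences in which each term occurs.
    sf = Counter(term for toks in tokens for term in set(toks))

    scores = []
    for toks in tokens:
        if not toks:
            scores.append(0.0)
            continue
        tf = Counter(toks)
        # Sum of tf * idf over the sentence's terms, normalised by sentence length.
        total = sum(count * math.log(n / sf[term]) for term, count in tf.items())
        scores.append(total / len(toks))

    # Pick the top-k sentence indices, then restore document order.
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(document_text, k=3)` would return up to three sentences copied verbatim from the input. Because the score rewards terms that are rare across sentences, this sketch only approximates "importance"; linguistic features or a semantic kernel would replace the scoring step in a fuller system.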
Keywords: Data or Text Summarization; Inverse Document Frequency; Document Clustering
Disclaimer: All papers published in IJRST will be indexed on Google Search Engine as per their policy.