Document Clustering

Clustering is a method for analysing the textual attributes of a set of texts and grouping them by similarity. It helps researchers see which documents are most alike, and which differ, based on their features or attributes rather than their content.

Clustering can be used to determine whether a Content Set contains subsets or groups that aren’t easily uncovered by metadata, by other filtering, or perhaps by other formal textual characteristics. For instance, a Content Set composed of articles might include articles of different types or lengths, which could appear as distinct clusters despite their shared metadata values.
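The idea of articles of different lengths forming distinct clusters can be illustrated with a small sketch. This is not how Gale Digital Scholar Lab implements clustering; it is a minimal k-means example in plain Python, and the sample documents, the `features` function, and the choice of features (word count and average word length) are all assumptions made for illustration.

```python
import math

def features(text):
    """Hypothetical feature choice: (word count, average word length)."""
    words = text.split()
    count = max(len(words), 1)  # guard against empty documents
    return (float(len(words)), sum(len(w) for w in words) / count)

def kmeans(points, k=2, iterations=10):
    """Minimal k-means with a deterministic start, returning one label per point."""
    # Start from the first point, then add the point farthest from the chosen centroids.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda i: math.dist(p, centroids[i])) for p in points
        ]
        # Move each centroid to the mean of the points assigned to it.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                centroids[i] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels

# Invented sample documents: two short notices and two long articles.
docs = [
    "Ship arrived Tuesday.",
    " ".join(["The committee debated the proposed railway extension at length"] * 20),
    "Market prices fell sharply today.",
    " ".join(["Correspondents reported further delays along the northern route"] * 25),
]

labels = kmeans([features(d) for d in docs])
print(labels)  # [0, 1, 0, 1]
```

Here the two short notices share one label and the two long articles share the other, even though nothing in the code looks at what the documents say. The features are left unscaled, so word count dominates the distance; a real analysis would usually normalise features first.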

Learn more about Document Clustering and how to use it to analyse texts in Gale Digital Scholar Lab. 

DISCLAIMER

Any views and opinions expressed in these essays are those of the author in question, and any views or opinions from the original source material are those of the publication in question. Gale, part of Cengage Group, provides facsimile reproductions of original sources and does not endorse or dispute the content contained in them. Author affiliation and information within them are correct as of the original publication date.