GhostDetect works by extracting a range of features from a document, including measures of document complexity, parts of speech, and function words. The features are normalized against a corpus of documents, and principal component analysis is used to find the vectors that discriminate most strongly among documents. The 18 most discriminatory vectors are used to control the dimensions of the face.
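The normalization and component-selection steps can be illustrated with a minimal sketch. This is not GhostDetect's actual code: the function names `top_components` and `project`, the use of NumPy, the z-score normalization, and the random stand-in corpus are all assumptions made for illustration. It also treats "most discriminatory" as "highest variance across the corpus", which is the ordering standard PCA gives.

```python
import numpy as np

def top_components(features, k=18):
    """Standardize a (documents x features) matrix and return its top-k principal axes.

    Hypothetical helper: the real feature set and normalization may differ.
    """
    X = np.asarray(features, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0          # avoid dividing by zero for constant features
    Z = (X - mean) / std         # z-score each feature against the corpus

    # SVD of the standardized matrix: rows of Vt are the principal axes,
    # ordered by decreasing singular value (i.e. decreasing variance).
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    variances = (s ** 2) / (len(Z) - 1)
    return mean, std, Vt[:k], variances[:k]

def project(doc_features, mean, std, components):
    """Map one document's feature vector onto the retained components."""
    z = (np.asarray(doc_features, dtype=float) - mean) / std
    return components @ z

# Stand-in corpus: 200 documents with 40 features each (random, not real data).
rng = np.random.default_rng(0)
corpus = rng.normal(size=(200, 40))

mean, std, components, variances = top_components(corpus, k=18)
scores = project(corpus[0], mean, std, components)  # 18 values, one per face dimension
```

Each of the 18 values in `scores` would then drive one dimension of the face. How a score is scaled to a particular facial feature is not specified here and is left out of the sketch.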