How does it work?
Docs Detective uses sophisticated algorithms to process and compare documents, and presents the results in a way that makes comparison quick and easy. This document gives a high-level overview of the plagiarism detection process. It is simplified, and we have left out much of the secret sauce.
Portions of the document are automatically sent to the Google Search API, which returns a list of documents that contain matching text. The results of multiple searches are combined into one list of documents.
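As a rough illustration of this step, the sketch below picks short phrases from a document to use as search queries and merges several result lists into one. It is a minimal sketch, not Docs Detective's actual method: the function names, the fixed-length word windows, and the ranking by how many searches returned a URL are all assumptions, and the call to the search service itself is left out.

```python
def sample_phrases(text, phrase_len=8, stride=20):
    """Pick fixed-length word windows from the text to use as search queries.

    phrase_len and stride are illustrative defaults, not values from the product.
    """
    words = text.split()
    if len(words) <= phrase_len:
        return [" ".join(words)] if words else []
    return [
        " ".join(words[i:i + phrase_len])
        for i in range(0, len(words) - phrase_len + 1, stride)
    ]


def merge_results(result_lists):
    """Combine several lists of URLs into one list without duplicates.

    URLs returned by more searches come first; ties keep first-seen order.
    """
    counts = {}
    for urls in result_lists:
        # dict.fromkeys removes duplicates within one search and keeps order
        for url in dict.fromkeys(urls):
            counts[url] = counts.get(url, 0) + 1
    # sorted() is stable, so equal counts stay in first-seen order
    return sorted(counts, key=lambda url: -counts[url])
```

Each phrase from `sample_phrases` would be sent as one query, and the lists of URLs that come back would be passed together to `merge_results`.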
The documents are then downloaded to the App Engine server, where the text is extracted.
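Text extraction could look something like the sketch below, which assumes the downloaded document is HTML and uses only Python's standard `html.parser` module. The class and function names are hypothetical, the download itself is omitted, and the real service presumably handles more formats than this.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text from HTML, skipping script and style content."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Join the chunks and collapse runs of whitespace into single spaces
        return " ".join(" ".join(self._chunks).split())


def extract_text(html):
    """Return the visible text of an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```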
Using algorithms and data structures optimized for comparing text, a document can be compared to several hundred web documents in just a few seconds.
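The actual algorithms are not described here, so the following is only one common way such a comparison can be made fast: word n-gram "shingling" with hash sets. Every name and the choice of n are assumptions made for illustration. The document's shingles are computed once, and each web document is then checked against that set with a set intersection.

```python
def shingles(text, n=5):
    """Return the set of n-word sequences (shingles) in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_scores(document, web_documents, n=5):
    """Map each URL to the fraction of the document's shingles it contains.

    web_documents is a dict of {url: extracted_text}.
    """
    doc_shingles = shingles(document, n)
    if not doc_shingles:
        return {url: 0.0 for url in web_documents}
    return {
        url: len(doc_shingles & shingles(text, n)) / len(doc_shingles)
        for url, text in web_documents.items()
    }
```

Because set membership tests take roughly constant time, the cost of each comparison grows with the length of the web document rather than with the product of the two lengths.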
To make comparison simple and quick, web documents that match the same section of plagiarized text are grouped together, and statistics are calculated for each section of text.
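One way to picture the grouping is sketched below. It assumes each match is reported as a half-open word range `(start, end)` in the submitted document plus the URL of the matching web document; overlapping or touching ranges are merged into a single section, and a few simple statistics are computed per section. The data layout and the particular statistics are hypothetical, chosen only to illustrate the idea.

```python
def group_matches(matches, total_words):
    """Group matches into sections of the document and compute statistics.

    matches: iterable of (start, end, url) with half-open word ranges.
    total_words: number of words in the submitted document (must be > 0).
    """
    sections = []
    for start, end, url in sorted(matches):
        if sections and start <= sections[-1]["end"]:
            # Overlaps or touches the previous section: extend it
            section = sections[-1]
            section["end"] = max(section["end"], end)
            section["sources"].add(url)
        else:
            sections.append({"start": start, "end": end, "sources": {url}})
    for section in sections:
        section["words"] = section["end"] - section["start"]
        section["percent_of_document"] = 100.0 * section["words"] / total_words
    return sections
```

For example, matches at words 0 to 10 and 5 to 15 from two different web documents would be reported as a single section covering words 0 to 15 with two sources.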