AI content search module

We have noticed that students and educators are actively using various mechanisms and tools for content creation, including AI tools.

Although AI tools have appeared only recently, they have already become part of the educational process. Students and teachers use them because they are efficient and fast and draw on large amounts of information. However, their use carries some risk.

Our company decided to create a module that would meet the needs of educational institutions, organizations and publishing houses. With an effective tool to counteract abuses that may arise from the use of ChatGPT, Bard and other AI tools, an educational institution can better protect students from violating the principles of academic integrity and uphold the quality standards of education.


The AI content report is placed inside the interactive similarity report, which makes it convenient to analyse. You can also evaluate the document on both criteria at once and leave comments related to both AI and plagiarism.


By clicking on Details in the AI Content Search section, you will be able to open the second report.

Colours in the AI Probability Report

Our report traces both the AI probability ratio and the AI probability of each text fragment by colouring the fragments. Each colour represents the probability that the text was written by an AI rather than by a human. The report also lists the fragments together with the AI Probability Coefficient of each one.


If the text is green, the probability that it is machine-written is minimal; if it is red, that probability is at its maximum.

These colours cannot be changed manually, accepted or rejected. The probability that the text is machine-written is assessed by the best modules and algorithms currently available.

What does AI content probability (probability coefficient) mean?

The AI Content Probability Coefficient is a prediction of how likely it is that the entire text was generated by AI rather than written by a human. The coefficient is not a measure of the ratio of AI-generated text to the original content of the document.
If a paper has a low Similarity Coefficient but a high AI Content Probability Coefficient, this is most likely a false positive from the system, so the document should be analysed in detail.

AI Content Indicator
We designed an additional tool that shows only the fragments with a high probability of AI in the report. The user can easily display only the fragments that are most relevant for analysis, for example, those where the AI probability coefficient exceeds 60% or even 80%. The AI Content Indicator uses a threshold value: once it is enabled, the system highlights only the fragments that exceed that threshold.
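
The filtering behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the module's actual implementation: the Fragment type, the filter_fragments function and the sample values are hypothetical, and the 60% default simply reuses the example threshold from the text.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """Hypothetical report fragment: its text and AI probability in percent."""
    text: str
    ai_probability: float  # 0-100


def filter_fragments(fragments, threshold=60.0):
    """Return only the fragments whose AI probability exceeds the threshold."""
    return [f for f in fragments if f.ai_probability > threshold]


# Invented sample data, for illustration only.
report = [
    Fragment("Introduction paragraph", 12.0),
    Fragment("Literature overview", 67.5),
    Fragment("Conclusion", 91.0),
]

highlighted_60 = [f.text for f in filter_fragments(report, threshold=60.0)]
highlighted_80 = [f.text for f in filter_fragments(report, threshold=80.0)]
print(highlighted_60)
print(highlighted_80)
```

Raising the threshold from 60 to 80 narrows the highlighted set to the fragments with the strongest AI signal, which mirrors the way the indicator is meant to be used.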

How does AI detection work?


The module applies supervised learning using several models, including a modified BERT model, to predict whether content is artificial or original. The models were presented with millions of texts of both AI and original content and trained to tell the two apart. After each training session, a large set of test data is used to evaluate whether the new model is an improvement.

Linguistic analysis
Because content created by AI tends to follow recurring patterns, it is not surprising that it often contains repeated phrases, strange syntax, or a lack of the nuances characteristic of human writing.
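
One of the signals mentioned above, repeated phrases, can be illustrated with a simple word n-gram count. This is a minimal sketch and not the method the module uses: the repeated_phrases function, its parameters and the sample text are all hypothetical.

```python
import re
from collections import Counter


def repeated_phrases(text, n=3, min_count=2):
    """Return word n-grams that occur at least min_count times in the text."""
    # Lowercase and keep only letter sequences; punctuation is ignored,
    # so an n-gram may span a sentence boundary.
    words = re.findall(r"[a-z']+", text.lower())
    ngrams = [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    counts = Counter(ngrams)
    return {phrase: c for phrase, c in counts.items() if c >= min_count}


# Invented sample text, for illustration only.
sample = (
    "It is important to note that results vary. "
    "It is important to note that context matters."
)
repeats = repeated_phrases(sample)
print(repeats)
```

Human authors also repeat themselves, so a count like this is at most a weak hint that needs to be combined with other evidence.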

Statistical analysis
Many platforms use statistical models to assess text complexity, sentence structure, and vocabulary usage to determine whether the text was written by AI. Texts created by AI often have a uniform sentence structure and length, which distinguishes them from human writing.
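
The uniformity of sentence length mentioned above can be measured directly. The sketch below is an assumption-laden illustration rather than the module's algorithm: sentence_length_stats is a hypothetical helper, sentences are split naively on '.', '!' and '?', and the sample texts are invented.

```python
import re
import statistics


def sentence_length_stats(text):
    """Return (mean, standard deviation) of sentence lengths in words."""
    # Naive split on sentence-ending punctuation; abbreviations are not handled.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if not lengths:
        raise ValueError("text contains no sentences")
    mean = statistics.mean(lengths)
    # The population standard deviation is 0.0 when all sentences
    # have the same length.
    spread = statistics.pstdev(lengths)
    return mean, spread


uniform = "The cat sat down. The dog ran off. The bird flew away."
varied = "Stop. The dog ran off across the wide green field. Why?"

uniform_stats = sentence_length_stats(uniform)
varied_stats = sentence_length_stats(varied)
print(uniform_stats)
print(varied_stats)
```

A small spread points to uniform sentence structure, but on its own it does not show that a text was written by AI.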

Machine learning models
Machine learning models (for example, those behind Originality.AI) are trained to distinguish between human and AI texts. They use various features of the text, including those produced by the methods described above, to learn the characteristics that indicate AI writing.

It's important to remember:

The module is more than 94% accurate at finding text generated by GPT-3, GPT-3.5, GPT-4o, GPT-Plus, GPT-Search and ChatGPT. However, it is not perfect, and errors are always possible.

It is more reliable and safer to analyse a series of an author's documents than to base a decision on a single document.

Document length matters: the longer the document, the more accurate the result.

The recommended threshold for the AI Probability Coefficient (AIPC) is above 60%. If the AIPC is above 80% and the Similarity Coefficient (SC) is below 20%, the paper should be carefully analysed.
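
The AIPC and SC guidance above can be written down as a small decision rule. This is a sketch under stated assumptions, not part of the product: the triage function and its three result labels are hypothetical, and treating 60% as the cut-off for reviewing fragments is one reading of the recommendation.

```python
def triage(aipc, sc, aipc_threshold=60.0):
    """Classify a document from its AIPC and SC values (both in percent)."""
    # High AI probability combined with low similarity calls for the
    # closest look, per the guidance in the text.
    if aipc > 80.0 and sc < 20.0:
        return "analyse carefully"
    # Above the recommended threshold: look at the highlighted fragments.
    if aipc > aipc_threshold:
        return "review highlighted fragments"
    return "no AI flag"


# Invented example values, for illustration only.
print(triage(85.0, 10.0))
print(triage(65.0, 35.0))
print(triage(30.0, 5.0))
```

The result is only a prompt for a human reviewer; as noted above, the decision is safer when it is based on a series of the author's documents.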