Our Methodology

NS8 uses many different detection methods to determine whether a user is valid. Some of these methods are algorithmic, and others are learned over time by detecting patterns in the data. To protect our intellectual property and prevent reverse-engineering, not all methods are listed here.

Scoring

Some methods are fairly conclusive about whether the user is valid. Other methods produce a likelihood of validity, which we show as a score from 0 to 1000. The lower the score, the more likely the user is invalid. For example, a user proxying in through a data center's I.P. address is highly unlikely to be a valid user. On the other hand, certain countries originate the bulk of invalid traffic but also have real users, so a signal such as country of origin is treated as a likelihood rather than a conclusive result.

Scores will generally fall between 0 and about 500. A score below 100 is considered 'invalid', meaning it is almost certain that the user is not real. The upper end of the scoring range is reserved for future whitelisting methods.
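As a rough illustration of the threshold described above, the following Python sketch labels a score. Only the 0 to 1000 range and the below-100 'invalid' cut-off come from this article; the function name and the "not invalid" label are assumptions made for the example and are not part of any NS8 API.

```python
MIN_SCORE, MAX_SCORE = 0, 1000  # from the article: scores range from 0 to 1000
INVALID_THRESHOLD = 100         # from the article: below 100 is considered 'invalid'


def classify_score(score: int) -> str:
    """Return a coarse label for a score (hypothetical helper, not an NS8 API)."""
    if not MIN_SCORE <= score <= MAX_SCORE:
        raise ValueError(f"score must be between {MIN_SCORE} and {MAX_SCORE}")
    if score < INVALID_THRESHOLD:
        return "invalid"
    # Assumed label: the article names only the 'invalid' band.
    return "not invalid"


print(classify_score(40))   # invalid
print(classify_score(350))  # not invalid
```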

Bots

A large and growing percentage of web traffic is generated by bots, spiders, extensions, headless browsers, toolbars and other means (collectively called “bots”). These bots have become increasingly sophisticated in how they disguise themselves, which means that fraud detection systems must continuously evolve their detection methods to be able to block malicious traffic.

Here are some of the indicators that we check to assess a potential bot:

Block List

We check every I.P. address against our database of known infected machines. This check detects machines that have been hijacked as spambots, as well as machines that are infected with viruses and generate large amounts of automated traffic or clicks. The database is maintained in real time to detect emerging sources and keep up to date on the latest trends.

Data Center Origin

We maintain a database of data center I.P. address ranges, since many bot networks will use data centers to create or proxy traffic. For example, a session from within Amazon Web Services' data center address block is unlikely to be valid.
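A range lookup of this kind might be sketched as follows, using Python's standard `ipaddress` module. The ranges shown are reserved documentation blocks used as placeholders, and the function name is hypothetical; this is not NS8's actual database or implementation.

```python
import ipaddress

# Placeholder ranges (reserved documentation blocks); a real system would
# load actual data center address ranges from its own database instead.
DATA_CENTER_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_data_center_ip(address: str) -> bool:
    """Return True if the address falls inside any listed range."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in DATA_CENTER_RANGES)


print(is_data_center_ip("192.0.2.17"))   # True
print(is_data_center_ip("203.0.113.5"))  # False
```

A linear scan is fine for a handful of ranges; a large database would more likely use a sorted or tree-based index.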

Public Web Proxies

Public web proxies are also used to hide a user’s real location: the user’s I.P. address appears to come from somewhere else, much like proxying through a data center as described above. We maintain a real-time database of public web proxies so we can detect sessions from them and score users accordingly.

TOR

TOR is free software that enables anonymous online communication. TOR has legitimate uses, but because it hides the origin of the user, it is inherently suspicious and can be used to generate random sessions.

Spoofed User Agents

Bots often rotate their user agents to appear to be multiple devices and generate realistic-looking traffic. We have developed technology to match the user agent to the browser’s capabilities and detect sessions that have altered their user agent.

Invalid Searches

Bots often create fake referrer headers to appear to be from a search engine. In many cases, these headers differ from real search engine referrer structures.
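One way to picture a structural check on referrers is the sketch below. The host `search.example`, the expected path, and the query parameter are placeholders chosen for illustration, not the format of any real search engine, and the function is a hypothetical helper rather than NS8's method.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical rules: host -> (expected path, required query parameter).
SEARCH_REFERRER_RULES = {
    "search.example": ("/results", "q"),
}


def is_suspicious_search_referrer(referrer: str) -> bool:
    """Return True if the referrer names a listed search host but has the wrong structure."""
    parsed = urlparse(referrer)
    rule = SEARCH_REFERRER_RULES.get(parsed.hostname or "")
    if rule is None:
        return False  # the referrer does not claim to be a listed search engine
    expected_path, required_param = rule
    query = parse_qs(parsed.query)
    return parsed.path != expected_path or required_param not in query


print(is_suspicious_search_referrer("https://search.example/results?q=shoes"))  # False
print(is_suspicious_search_referrer("https://search.example/?ref=1"))           # True
```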

Collusion

This method detects cases where a particular set of I.P. addresses coincides with a particular set of publisher sites.

Other Proprietary Methods

We have developed several other methods for detecting fraudulent sessions and this continues to be a primary focus of our research efforts.

Hidden Users

Hidden users come from sessions in which no page is ever visible on the screen. Whether the session belongs to a bot or to a human user who never looks at the display, a hidden session receives an EQ8 Score of 0 because no page content was ever seen by a real person.

Here are some of the primary reasons that a user may be categorized as a hidden session:

Preloading

Search engines will preload pages in the background while a user types in a search query. The search engine attempts to predict which link or links the user will click on and then loads the pages from those links. This is a way to improve the performance of web browsing; however, many of the preloaded pages are never made visible and should not be counted as evidence of real site activity.

Browser Window Hidden

This occurs when a browser window is behind another window, so web content can’t be seen by the human user.

Background Browser Tabs

A browser tab can be launched in the background and load pages. These pages are never visible unless the user opens the tab. 

Bots

The session is detected as a bot and not a real person. This is usually the default reason unless the user falls under one of the categories above.

Our technology tracks whether a session is ever viewed and updates its visibility accordingly. For example, if a page is hidden during a preload, it is initially recorded as hidden and given a score of zero. If the user clicks on the link to view the preloaded page, that is detected and the session is updated with a new score.
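The update described above can be pictured with a small Python sketch. The `Session` class, its field names, and `mark_visible` are assumptions made for illustration. How the new score is actually computed is not described in this article, so it is passed in as a plain value.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """Hypothetical session record; field names are illustrative only."""
    session_id: str
    visible: bool = False  # a preloaded page starts out hidden
    score: int = 0         # hidden sessions receive a score of 0


def mark_visible(session: Session, new_score: int) -> None:
    """Record that the session was viewed and replace the hidden score."""
    session.visible = True
    session.score = new_score


preloaded = Session("abc123")
print(preloaded.visible, preloaded.score)  # False 0

# The user clicks through to the preloaded page; 420 is an arbitrary example value.
mark_visible(preloaded, 420)
print(preloaded.visible, preloaded.score)  # True 420
```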

Each session is scored, and all reports have options to include or exclude users based on score. For example, you may want to view campaigns where the score is less than 100. This would show you the campaigns that are referring the lowest-quality users.
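To make the filtering idea concrete, here is a minimal Python sketch that counts low-scoring sessions per campaign. The record layout, campaign names, and function name are invented for the example and do not reflect the actual report interface.

```python
from collections import defaultdict

# Invented sample data: one record per session.
sessions = [
    {"campaign": "spring-sale", "score": 40},
    {"campaign": "spring-sale", "score": 0},
    {"campaign": "newsletter", "score": 85},
    {"campaign": "newsletter", "score": 310},
    {"campaign": "partner", "score": 450},
]


def low_score_counts(records, threshold=100):
    """Count sessions scoring below the threshold, grouped by campaign."""
    counts = defaultdict(int)
    for record in records:
        if record["score"] < threshold:
            counts[record["campaign"]] += 1
    return dict(counts)


print(low_score_counts(sessions))  # {'spring-sale': 2, 'newsletter': 1}
```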
