Thursday, June 30, 2022
Techiexpert.com

How to Detect Outliers in Your Company’s Data

by Anodot
January 31, 2020
in Data Analytics

One of the greatest challenges in business is dealing with anomalies in the data that come across your desk. According to the Engineering Statistics Handbook, “An outlier is an observation that lies at an abnormal distance from other values in a random sample from a population.” In order for a business to be successful, its principals need to understand any deviation from what constitutes the norm — from “business as usual,” so to speak.

Suppose you own a company called Trendline Sunglasses and you’ve just discovered a spike in overall sales of your top product. This could be good news: the result of a successful marketing campaign, which could mean greater profitability and a need to reallocate funds. But a spike in sales could also indicate a problem, such as a pricing glitch that has caused a run on an item and is costing you revenue.

The point here is that outliers must be investigated thoroughly since they often hold valuable—even critical—information about a process.

Types of outliers

1. Global outliers (point anomalies)

Point anomalies are single data points that are termed global because they lie far outside the distribution of the data set as a whole. 

Suppose a business customer who typically deposits $10,000 in the bank each Friday suddenly deposits $50,000 on two consecutive Fridays. Those deposits would be global anomalies, since they fall far outside this customer’s history. A time series line chart of the account would show a hockey-stick uptick, which would likely alert federal authorities to possible illicit activity.
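
As a rough illustration of how such a point anomaly could be caught, here is a minimal sketch that scores each deposit against the median of the whole series. The deposit figures, the function name `global_outliers` and the 3.5 cutoff are assumptions made for this example, not values from a real system.

```python
from statistics import median

def global_outliers(values, threshold=3.5):
    """Return indices of points whose modified z-score exceeds threshold.

    The modified z-score uses the median and the median absolute
    deviation (MAD), so the outliers themselves barely shift the baseline.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # More than half the values are identical: flag anything that differs.
        return [i for i, v in enumerate(values) if v != med]
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]

# Hypothetical Friday deposits: ten typical weeks, then two unusual ones.
deposits = [10_000, 9_800, 10_200, 10_100, 9_900, 10_000,
            10_300, 9_700, 10_050, 9_950, 50_000, 50_000]

flagged = global_outliers(deposits)
print(flagged)  # indices of the two $50,000 deposits
```

A median-based score is used here instead of a plain mean and standard deviation because two large deposits would inflate the standard deviation and could partly hide themselves.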

2. Contextual (conditional) outliers

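A data point is a contextual outlier if its value deviates significantly from other data points in the same context, even though the same value might be unremarkable in a different context. In textual data, this could be punctuation among letters; in speech recognition, background noise.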

An example of a contextual outlier might be the aforementioned sudden surge in sales volume at the sunglasses company if this falls outside of a promotion. Might this surge be due to a price glitch?
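
One simple way to express "same context" in code is to compare a new value only with past values that share its label. The sketch below assumes each day of sales is tagged as "promo" or "regular"; the labels, the numbers and the function name `is_contextual_outlier` are all hypothetical.

```python
from statistics import mean, stdev

def is_contextual_outlier(history, context, value, threshold=3.0):
    """Compare value only with past observations from the same context.

    history is a list of (context, value) pairs.
    """
    peers = [v for c, v in history if c == context]
    if len(peers) < 2:
        return False  # not enough history in this context to judge
    mu, sigma = mean(peers), stdev(peers)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold

# Hypothetical daily unit sales, labeled by whether a promotion was running.
history = [
    ("promo", 900), ("promo", 950), ("promo", 1000), ("promo", 1050),
    ("regular", 200), ("regular", 220), ("regular", 180),
    ("regular", 210), ("regular", 190),
]

# The same sales figure is ordinary during a promotion
# and anomalous on a regular day.
print(is_contextual_outlier(history, "promo", 980))
print(is_contextual_outlier(history, "regular", 980))
```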

3. Collective outliers

A subset of data points within a larger data set is a collective outlier if the points, taken together, deviate significantly from the rest of the data set, even when no single point looks anomalous on its own.

It is axiomatic that stock prices of publicly traded companies fluctuate. It’s why we hire people to manage our portfolios for us: so that we don’t have to watch whether our stocks are up or down. But if a stock stayed at the exact same price (to the penny) for a long period of time, that would be a collective outlier. Such an event appears to have happened in 2017, when a computer glitch caused several tech stocks, including Apple and Amazon, to be listed at $123.45 for a very long time.
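
A frozen price is easy to miss with point-based checks, because each individual value can look plausible. The sketch below looks for runs of identical consecutive values instead. The price series, the function name `flatline_runs` and the minimum run length of five are illustrative assumptions.

```python
def flatline_runs(prices, min_length=5):
    """Return (start, end) index pairs, end inclusive, for every run of
    identical consecutive values that is at least min_length long."""
    runs = []
    start = 0
    for i in range(1, len(prices) + 1):
        # Close the current run at the end of the data or when the value changes.
        if i == len(prices) or prices[i] != prices[start]:
            if i - start >= min_length:
                runs.append((start, i - 1))
            start = i
    return runs

# Hypothetical price feed: normal movement, a frozen stretch, then recovery.
prices = [150.10, 151.32, 149.87,
          123.45, 123.45, 123.45, 123.45, 123.45, 123.45,
          150.02]

print(flatline_runs(prices))  # one run covering the six frozen quotes
```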

Within each of these categories, you can find examples of univariate and multivariate anomalies. Univariate anomalies are outliers on one variable; multivariate anomalies are outliers on at least two. Both types can influence outcomes in statistical analysis.

Time series data & analysis

A time series consists of a succession of data points taken from measurements over time. Some examples of time series are ocean tide measurements, counts of sunspots, stock market values, and measurements of weather activity. Visualizations of time series data are typically done with line charts.

Common business applications of time series analysis include webpage views over time, active app users, sales by platform, time on site, and number of transactions over time. All of these show how valuable time series can be within a business context.

Detection methods for time series data

Univariate outlier (anomaly) detection

Univariate time series measures one variable over time. For example, data might be collected on the number of visitors on a specific webpage every quarter-hour. This will give you a one-dimensional value every fifteen minutes. Univariate anomaly detection will focus on this one specific metric.
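
To make this concrete, here is a minimal sketch of univariate detection on a single metric: each new quarter-hour count is compared with the mean and standard deviation of the preceding readings. The visit counts, the window of eight readings and the threshold of three standard deviations are assumptions chosen for the example.

```python
from collections import deque
from statistics import mean, stdev

def rolling_anomalies(series, window=8, threshold=3.0):
    """Flag index i when series[i] lies more than `threshold` standard
    deviations from the mean of the `window` readings before it."""
    recent = deque(maxlen=window)
    flagged = []
    for i, x in enumerate(series):
        if len(recent) == window:
            mu, sigma = mean(recent), stdev(recent)
            if sigma > 0 and abs(x - mu) / sigma > threshold:
                flagged.append(i)
        recent.append(x)  # the oldest reading drops out automatically
    return flagged

# Hypothetical page visits, one reading every fifteen minutes.
visits = [120, 118, 125, 122, 119, 121, 124, 120, 123, 310, 122, 119]

print(rolling_anomalies(visits))  # index of the 310-visit reading
```

Note that in this simple version the anomalous reading stays in the window afterward and temporarily widens the band, so a production detector would probably exclude flagged points from its baseline.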

An advantage of univariate modeling is that it lets you home in on specific processes and see them fully. But this is also its disadvantage: by focusing solely on one metric, you may miss problems that show up in other metrics.

Multivariate outlier (anomaly) detection

Multivariate outlier detection refers to processes for detecting anomalies across two or more variables in time series data. An advantage of multivariate detection is that it seeks to detect outliers as complete incidents by learning a single model for all of the data metrics. But like Maslow’s concept that “if you have a hammer, everything looks like a nail,” not all problems suit the heavy-handedness of multivariate models. Some problems require the focused intensity of univariate modeling.
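
One common multivariate technique, used here purely as an illustration, is the Mahalanobis distance, which measures how far a point lies from the center of the data while accounting for the correlation between variables. The sketch below handles exactly two metrics so that it needs nothing beyond the standard library; the metric pairs and the function name `mahalanobis_2d` are hypothetical.

```python
from statistics import mean

def mahalanobis_2d(points, query):
    """Mahalanobis distance of `query` from a cloud of 2-D points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    n = len(points)
    # Sample variances and covariance of the two metrics.
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det == 0:
        raise ValueError("metrics are perfectly correlated or constant")
    dx, dy = query[0] - mx, query[1] - my
    # Quadratic form with the inverse of the 2x2 covariance matrix.
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return d2 ** 0.5

# Hypothetical (page views, transactions) pairs that rise together.
history = [(100, 10), (120, 12), (140, 15), (160, 16),
           (180, 19), (200, 20), (220, 21), (240, 25)]

typical = mahalanobis_2d(history, (170, 17))
odd_pair = mahalanobis_2d(history, (230, 11))
print(round(typical, 2), round(odd_pair, 2))
```

In this made-up data, 230 page views and 11 transactions are each within the range seen before, so a univariate check on either metric would pass them. Only the combination (heavy traffic with few transactions) is unusual, which is what the much larger distance reflects.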

Hybrid approach

A hybrid approach combines univariate and multivariate anomaly detection, and in most cases it is superior to either one used alone. The two types of detection are complementary: univariate modeling gives you a focused view of individual metrics, while multivariate modeling groups related anomalies into concise incidents and helps you decipher them. Using both together gives you the most complete picture.
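
There are many ways to build a hybrid system. One simple interpretation, sketched below under stated assumptions, is to run a univariate detector on every metric and then group the flags that occur at the same time into a single incident. The metric names, the values and both function names are invented for the example.

```python
from collections import defaultdict
from statistics import median

def univariate_flags(values, threshold=3.5):
    """Indices whose modified z-score (median and MAD based) exceeds threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [i for i, v in enumerate(values) if v != med]
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]

def group_incidents(metrics):
    """Map each time index to the sorted names of the metrics flagged there."""
    incidents = defaultdict(list)
    for name, values in metrics.items():
        for i in univariate_flags(values):
            incidents[i].append(name)
    return {i: sorted(names) for i, names in sorted(incidents.items())}

# Hypothetical metrics sampled at the same ten points in time.
metrics = {
    "page_views":      [500, 510, 495, 505, 498, 502, 507, 120, 503, 499],
    "checkouts":       [50, 52, 49, 51, 50, 48, 51, 9, 50, 52],
    "support_tickets": [5, 6, 5, 4, 6, 5, 5, 6, 30, 5],
}

incidents = group_incidents(metrics)
print(incidents)
```

Here the drop in page views and the drop in checkouts land in one incident rather than two separate alerts, while the later surge in support tickets is reported on its own.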

Automated real-time anomaly detection for business metrics

Businesses today have thousands, often millions, of metrics to track. It is financially infeasible to hire the number of analysts needed to track every anomaly that occurs. As a result, today’s companies all too often settle for high-level dashboards that aren’t nearly as refined as they need to be. With too few analysts and dashboards that have coarse filters, catching all the anomalies becomes a matter of luck. The only way to solve the problem is to adopt an automated approach to anomaly detection.

A reliable automated real-time anomaly detection system should use sophisticated, intelligent detection methods so as to detect all types of outliers—global, contextual, collective—and to understand the relationships between different data sets.

This is best achieved through:

  1. A hybrid detection approach
  2. The implementation of appropriate models and distributions for each time series (stationary, non-stationary, irregularly sampled, discrete, etc.)
  3. The consideration of seasonal and trend patterns
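
To show why the third item matters, here is a minimal sketch that builds a seasonal baseline before looking for outliers: each value is compared with the median of the values at the same position in the cycle. It assumes a fixed weekly period and no trend (a trend would need to be removed first), and the sales figures and the function name `seasonal_anomalies` are hypothetical.

```python
from statistics import median

def seasonal_anomalies(series, period, threshold=3.5):
    """Flag points that deviate from the typical value for their position
    in the seasonal cycle (for example, the same day of the week)."""
    # Baseline for each phase: median of all values sharing that phase.
    baseline = [median(series[phase::period]) for phase in range(period)]
    residuals = [x - baseline[i % period] for i, x in enumerate(series)]
    # Score the residuals with a median/MAD based z-score.
    med = median(residuals)
    mad = median(abs(r - med) for r in residuals)
    if mad == 0:
        return [i for i, r in enumerate(residuals) if r != med]
    return [i for i, r in enumerate(residuals)
            if 0.6745 * abs(r - med) / mad > threshold]

# Hypothetical daily sales over four weeks, Monday first, with weekend peaks.
sales = [
    100, 102,  98, 101, 105, 180, 175,
    103,  99, 101,  98, 107, 176, 178,
     98, 104,  97, 103, 104, 183, 172,
    101, 100, 180, 100, 106, 179, 176,  # 180 units on a Wednesday
]

print(seasonal_anomalies(sales, period=7))  # the Wednesday in week four
```

A value of 180 is routine on a Saturday in this data, so a detector that ignores the weekly pattern would let it through. Only after the seasonal baseline is subtracted does the Wednesday reading stand out.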

Conclusion

When is a spike in sales a sign of a good thing, and when is it a sign of a problem? Outlier detection can help you answer this question and more. Anomaly detection is crucial for separating signal from noise in business data. Real-time automated anomaly detection helps pinpoint outliers in the millions of metrics generated each year.
