Find Statistical Outliers in CSV and Excel Data

Upload any CSV or Excel file and instantly detect outliers in every numeric column using the 3-sigma rule. See which values are more than 3 standard deviations from the mean, a common statistical definition of an outlier.

3-sigma rule detection
Distribution charts
Mean and std dev shown
No upload to server

Free CSV and Excel Data Analyzer: Find Duplicates, Nulls and Errors Instantly

Upload any CSV or Excel file to instantly find duplicate rows, null values, type mismatches and data quality issues. The Clean Data tab lets you remove duplicates, fill nulls and standardize headers in one click, then download the cleaned file. The Dashboard tab auto-generates charts from your data. No Python, no SQL, no formulas required.

How do I find and remove duplicate rows in CSV?
Upload your file, then go to the Clean Data tab. Click "Remove Duplicate Rows" to deduplicate instantly. The row count updates and you can download the cleaned file as CSV, Excel or JSON.
Can I fill null values instead of deleting rows?
Yes. In the Clean Data tab, choose "Fill Nulls" with options to fill with 0, with the column mean, or with an empty string. This lets you keep all rows while fixing missing values.
What does the Dashboard tab show?
The Dashboard tab auto-generates bar charts for numeric columns showing value distribution, and donut charts for categorical columns showing the most frequent values. It gives you an instant visual overview without any configuration.
Does this tool process all rows?
Yes. There is no row limit. All rows are processed in your browser. Your files never leave your device.
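
For readers who want to reproduce the two cleaning steps above outside the tool, here is a minimal Python sketch. It is not the tool's own code. The function names and the sample data are hypothetical, and it assumes that a "null" is an empty cell and that a duplicate is a row identical in every column.

```python
import csv
import io
from statistics import mean


def remove_duplicate_rows(rows):
    """Keep the first occurrence of each exact-duplicate row, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def fill_nulls_with_mean(rows, col):
    """Replace empty cells in column index `col` with the mean of the non-empty cells."""
    present = [float(r[col]) for r in rows if r[col].strip() != ""]
    if not present:
        return rows  # nothing to average, leave the column untouched
    fill = str(mean(present))
    return [
        r[:col] + [fill if r[col].strip() == "" else r[col]] + r[col + 1:]
        for r in rows
    ]


# Hypothetical sample: one duplicate row and one missing amount.
SAMPLE = "id,amount\n1,100\n2,\n1,100\n3,300\n"
header, *rows = list(csv.reader(io.StringIO(SAMPLE)))
rows = remove_duplicate_rows(rows)
rows = fill_nulls_with_mean(rows, 1)
print(rows)
```

Filling with the column mean keeps every row, but it also pulls the column's standard deviation down slightly, which is worth remembering before running outlier detection on the filled data.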

What Is the 3-Sigma Rule for Outlier Detection?

The 3-sigma rule, also called the empirical rule or the 68-95-99.7 rule, is a widely used statistical method for identifying unusual values in a dataset. In a normal distribution, about 68% of values fall within 1 standard deviation of the mean, about 95% fall within 2, and about 99.7% fall within 3. Any value more than 3 standard deviations from the mean is therefore in the most extreme 0.3% of the distribution and is flagged as a statistical outlier.
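
The three percentages can be checked directly. For a normal distribution, the share of values within k standard deviations of the mean equals erf(k / sqrt(2)). The short Python sketch below uses only the standard library, and the function name is just an illustration.

```python
import math


def coverage_within(k: float) -> float:
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} sigma: {coverage_within(k):.4f}")
```

This prints roughly 0.6827, 0.9545 and 0.9973, which is where the 68-95-99.7 name comes from.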

Example: Sales Amount Column
Mean: $4,500
Standard Deviation: $800
Normal Range (±3σ): $2,100 to $6,900
Flagged as Outlier: Below $2,100 or above $6,900
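
The arithmetic in this example, and the rule itself, can be sketched in a few lines of Python. This is an illustration rather than the tool's actual code. The function names are hypothetical, and it assumes the population standard deviation (statistics.pstdev). A tool could equally use the sample standard deviation, which gives slightly wider bounds.

```python
from statistics import mean, pstdev


def three_sigma_bounds(mu, sigma, k=3.0):
    """Return the (low, high) range that lies within k standard deviations of the mean."""
    return mu - k * sigma, mu + k * sigma


def find_outliers(values, k=3.0):
    """Return the values that fall outside mean ± k population standard deviations."""
    low, high = three_sigma_bounds(mean(values), pstdev(values), k)
    return [v for v in values if v < low or v > high]


# The sales example: mean $4,500 and standard deviation $800.
low, high = three_sigma_bounds(4500, 800)
print(low, high)

# Hypothetical column: twenty ordinary values and one extreme one.
print(find_outliers([100] * 20 + [1000]))
```

The first call reproduces the $2,100 to $6,900 range from the example, and the second flags only the value 1000.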

Four Common Causes of Outliers in Business Data

Data Entry Errors

A person typed 100000 instead of 10000. Or a form accepted scientific notation like 1e5, which is then stored as a much larger number than intended. Or a copy-paste duplicated a digit. These are the most common outliers in manually entered spreadsheets, and they are genuinely wrong values that should be corrected.

System or Sensor Errors

An API returned -1 as a default for a missing reading. A sensor produced a null reading that got stored as 0 or 9999. A database default value of 99999 was used for unknown records. These are not real data points and should be treated as missing values, not extreme but valid ones.
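
One way to treat such placeholders as missing is to mask them before computing any statistics. The Python sketch below assumes you already know which sentinel values your system uses. The SENTINELS set and the function name are hypothetical. A value like 0 is deliberately left out of the set because it is often a legitimate reading.

```python
# Hypothetical sentinel values; replace with the defaults your own system emits.
SENTINELS = {-1, 9999, 99999}


def mask_sentinels(values, sentinels=SENTINELS):
    """Replace known placeholder values with None so they count as missing."""
    return [None if v in sentinels else v for v in values]


readings = [21.5, 22.0, -1, 9999, 21.8]
cleaned = mask_sentinels(readings)
present = [v for v in cleaned if v is not None]
print(cleaned)
print(present)
```

Computing the mean and standard deviation from the present values only keeps a stray 9999 from inflating the standard deviation and hiding real outliers.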

Legitimate Extreme Values

A VIP customer who placed a $500,000 order while every other order is under $10,000. A transaction on Black Friday that is 20x the daily average. These are genuine values that accurately represent reality and should not be removed: they are the signal, not the noise.

Test or Staging Data

Records inserted during testing that have amounts like 12345.67 or dates in 1970 or 2099. These should be filtered out before any production analysis. Outlier detection often catches them because test values are chosen to be distinctive rather than realistic.

Frequently Asked Questions

How do I find outliers in a CSV file?
Upload your CSV file to this tool. After analysis, the Issues tab shows an Outliers section listing every numeric column that has values more than 3 standard deviations from the mean, with the outlier count and the column mean. The Columns tab shows a distribution histogram for each numeric column so you can visually see where extreme values appear.
Can I use this for Excel files as well as CSV?
Yes. Upload .xlsx or .xls files directly. The tool reads all rows from the first sheet and runs the same outlier detection as for CSV files, including distribution histograms and statistics per numeric column.
What if my data is not normally distributed?
The 3-sigma rule works best for roughly normal distributions. For highly skewed data, such as income or transaction amounts where most values are small but a few are very large, the 3-sigma rule may flag many legitimate high values as outliers. In those cases, consider transforming your data (such as using log scale) before analysis, or treat the flagged values as candidates for investigation rather than definitive outliers.
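
As a rough illustration of the log-scale suggestion, the Python sketch below applies the same 3-sigma test to the natural logarithm of each value. It assumes every value is strictly positive, since the logarithm is undefined otherwise, and it uses the population standard deviation. The function name is hypothetical and this is not a feature of the tool.

```python
import math
from statistics import mean, pstdev


def outliers_on_log_scale(values, k=3.0):
    """Flag values whose natural log is more than k standard deviations from the mean log.

    Assumes all values are strictly positive.
    """
    logs = [math.log(v) for v in values]
    mu, sigma = mean(logs), pstdev(logs)
    return [v for v, lv in zip(values, logs) if abs(lv - mu) > k * sigma]


# Hypothetical column: twenty small amounts and one amount that is 100,000x larger.
amounts = [10.0] * 20 + [1_000_000.0]
print(outliers_on_log_scale(amounts))
```

On the log scale a value has to differ by a large multiple, not just a large absolute amount, before it is flagged. That usually suits income or transaction data better.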
How do I remove outlier rows from my dataset?
This tool detects outliers but does not currently have a one-click outlier removal button, because outliers should be investigated before being removed. You can export the data using the Download button in the Data Table tab, then filter and remove the specific rows manually. For bulk operations on known outlier thresholds, the CSV to SQL tool can help you write a query to filter them.
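
If you prefer to script the manual filtering step, a small Python sketch like the one below can do it on the exported CSV. The column name, thresholds and sample data are hypothetical. Rows with a blank or non-numeric value in the chosen column are kept, so they can be reviewed separately rather than dropped silently.

```python
import csv
import io


def filter_rows_in_range(csv_text, column, low, high):
    """Return CSV text keeping only rows whose `column` value lies in [low, high].

    Rows with a blank or non-numeric value in `column` are kept for manual review.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        try:
            value = float(row[column])
        except (TypeError, ValueError):
            writer.writerow(row)  # not a number: keep it rather than guess
            continue
        if low <= value <= high:
            writer.writerow(row)
    return out.getvalue()


# Hypothetical export, with thresholds you have already investigated.
exported = "order_id,amount\n1,4200\n2,100000\n3,5100\n4,\n"
cleaned = filter_rows_in_range(exported, "amount", 2100, 6900)
print(cleaned)
```

Here only the 100000 row is dropped. The row with the missing amount stays in the output.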
What other data quality issues does this tool detect besides outliers?
The full analysis includes null and missing values per column, exact duplicate rows, data type mismatches (text in numeric columns), duplicate columns with identical data, and a data quality score from 0 to 100. All of these are shown in the same analysis when you upload your file.