Overview

Dataset statistics

Number of variables: 7
Number of observations: 344
Missing cells: 19
Missing cells (%): 0.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 18.9 KiB
Average record size in memory: 56.4 B
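
The missing-cell figures above can be cross-checked against the per-variable "Missing" counts listed in the Variables section. The following is a minimal sketch of that arithmetic; the dictionary is transcribed by hand from this report and is not produced by the profiling tool itself.

```python
# Per-column missing counts, transcribed from the Variables section of this report.
missing_per_column = {
    "species": 0,
    "island": 0,
    "bill_length_mm": 2,
    "bill_depth_mm": 2,
    "flipper_length_mm": 2,
    "body_mass_g": 2,
    "sex": 11,
}
n_rows = 344  # "Number of observations"
n_cols = len(missing_per_column)  # "Number of variables"

missing_cells = sum(missing_per_column.values())
total_cells = n_rows * n_cols
missing_pct = 100 * missing_cells / total_cells

print(f"missing cells: {missing_cells} of {total_cells} ({missing_pct:.1f}%)")
```

The percentage is taken over all cells (rows times columns), not over rows, which is why 19 missing cells amount to less than 1%.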

Variable types

Categorical: 3
Numeric: 4

Alerts

bill_depth_mm is highly overall correlated with flipper_length_mm and 2 other fields [High correlation]
bill_length_mm is highly overall correlated with body_mass_g and 3 other fields [High correlation]
body_mass_g is highly overall correlated with bill_length_mm and 3 other fields [High correlation]
flipper_length_mm is highly overall correlated with bill_depth_mm and 4 other fields [High correlation]
island is highly overall correlated with flipper_length_mm and 1 other field [High correlation]
sex is highly overall correlated with bill_depth_mm and 2 other fields [High correlation]
species is highly overall correlated with bill_depth_mm and 4 other fields [High correlation]
sex has 11 (3.2%) missing values [Missing]

Reproduction

Analysis started: 2026-02-22 11:30:32.868705
Analysis finished: 2026-02-22 11:30:34.289502
Duration: 1.42 seconds
Software version: ydata-profiling v4.18.1
Download configuration: config.json

Variables

species
Categorical

High correlation 

Distinct: 3
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 KiB

Value      Count
Adelie     152
Gentoo     124
Chinstrap  68

Length

Max length: 9
Median length: 6
Mean length: 6.5930233
Min length: 6
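
The length statistics follow directly from the category counts. As a rough sketch (the counts are copied from this report; the variable names are illustrative only), they can be recomputed with the standard library:

```python
from statistics import mean, median

# Category counts for `species`, as listed in this report.
counts = {"Adelie": 152, "Gentoo": 124, "Chinstrap": 68}

# One string length per observation.
lengths = [len(name) for name, n in counts.items() for _ in range(n)]

total_chars = sum(lengths)
print("max:", max(lengths))
print("median:", median(lengths))
print("mean:", round(mean(lengths), 7))
print("min:", min(lengths))
print("total characters:", total_chars)
```

Because only the 68 Chinstrap rows have nine characters, the median stays at 6 while the mean is pulled slightly above it.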

Characters and Unicode

Total characters: 2268
Distinct characters: 15
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Adelie
2nd row: Adelie
3rd row: Adelie
4th row: Adelie
5th row: Adelie

Common Values

Value      Count  Frequency (%)
Adelie     152    44.2%
Gentoo     124    36.0%
Chinstrap  68     19.8%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot; Adelie 152 (44.2%), Gentoo 124 (36.0%), Chinstrap 68 (19.8%)]

Most occurring characters

Value             Count  Frequency (%)
e                 428    18.9%
o                 248    10.9%
i                 220    9.7%
n                 192    8.5%
t                 192    8.5%
A                 152    6.7%
d                 152    6.7%
l                 152    6.7%
G                 124    5.5%
C                 68     3.0%
Other values (5)  340    15.0%
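
The character table can be reproduced from the category counts alone. The snippet below is a small sketch using `collections.Counter`; it weights each character of a category name by the number of rows in that category and is not the profiler's own implementation.

```python
from collections import Counter

# Category counts for `species`, as listed in this report.
counts = {"Adelie": 152, "Gentoo": 124, "Chinstrap": 68}

char_counts = Counter()
for name, n in counts.items():
    for ch in name:
        char_counts[ch] += n  # each occurrence in the name appears once per row

total = sum(char_counts.values())
for ch, c in char_counts.most_common(5):
    print(f"{ch}  {c}  {100 * c / total:.1f}%")
print("distinct characters:", len(char_counts))
```

Upper- and lower-case letters are counted separately, which is why "A" and "a" appear as different characters.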

Most occurring categories

Value      Count  Frequency (%)
(unknown)  2268   100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  2268   100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  2268   100.0%

Most frequent character per category, per script, and per block

All characters fall into a single (unknown) category, script, and block, so each of these three tables repeats the "Most occurring characters" table above.

island
Categorical

High correlation 

Distinct: 3
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 KiB

Value      Count
Biscoe     168
Dream      124
Torgersen  52

Length

Max length: 9
Median length: 6
Mean length: 6.0930233
Min length: 5

Characters and Unicode

Total characters: 2096
Distinct characters: 13
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Torgersen
2nd row: Torgersen
3rd row: Torgersen
4th row: Torgersen
5th row: Torgersen

Common Values

Value      Count  Frequency (%)
Biscoe     168    48.8%
Dream      124    36.0%
Torgersen  52     15.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot; Biscoe 168 (48.8%), Dream 124 (36.0%), Torgersen 52 (15.1%)]

Most occurring characters

Value             Count  Frequency (%)
e                 396    18.9%
r                 228    10.9%
s                 220    10.5%
o                 220    10.5%
B                 168    8.0%
i                 168    8.0%
c                 168    8.0%
D                 124    5.9%
a                 124    5.9%
m                 124    5.9%
Other values (3)  156    7.4%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  2096   100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  2096   100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  2096   100.0%

Most frequent character per category, per script, and per block

All characters fall into a single (unknown) category, script, and block, so each of these three tables repeats the "Most occurring characters" table above.

bill_length_mm
Real number (ℝ)

High correlation 

Distinct: 164
Distinct (%): 48.0%
Missing: 2
Missing (%): 0.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 43.92193
Minimum: 32.1
Maximum: 59.6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.8 KiB

Quantile statistics

Minimum: 32.1
5th percentile: 35.7
Q1: 39.225
Median: 44.45
Q3: 48.5
95th percentile: 51.995
Maximum: 59.6
Range: 27.5
Interquartile range (IQR): 9.275

Descriptive statistics

Standard deviation: 5.4595837
Coefficient of variation (CV): 0.124302
Kurtosis: -0.87602697
Mean: 43.92193
Median Absolute Deviation (MAD): 4.75
Skewness: 0.053118067
Sum: 15021.3
Variance: 29.807054
Monotonicity: Not monotonic
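
Several of these statistics are derived from one another, which gives a quick consistency check. The sketch below recomputes them from the values reported for bill_length_mm; the inputs are the rounded figures printed in this report, so results agree only up to rounding.

```python
# Values reported for bill_length_mm in this report (already rounded).
std = 5.4595837
mean = 43.92193
total = 15021.3  # "Sum"
q1, q3 = 39.225, 48.5
minimum, maximum = 32.1, 59.6

cv = std / mean                   # coefficient of variation
variance = std ** 2               # variance is the squared standard deviation
value_range = maximum - minimum   # range
iqr = q3 - q1                     # interquartile range
n_present = round(total / mean)   # non-missing observations implied by sum / mean

print(f"CV={cv:.6f}  variance={variance:.4f}  range={value_range:.1f}  "
      f"IQR={iqr:.3f}  n={n_present}")
```

The implied count of 342 matches 344 observations minus the 2 missing values, so the mean and sum are computed over non-missing rows only.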
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
41.1                7      2.0%
45.2                6      1.7%
45.5                5      1.5%
39.6                5      1.5%
50.5                5      1.5%
46.5                5      1.5%
50                  5      1.5%
37.8                5      1.5%
46.2                5      1.5%
46.4                4      1.2%
Other values (154)  290    84.3%

Minimum 10 values

Value  Count  Frequency (%)
32.1   1      0.3%
33.1   1      0.3%
33.5   1      0.3%
34     1      0.3%
34.1   1      0.3%
34.4   1      0.3%
34.5   1      0.3%
34.6   2      0.6%
35     2      0.6%
35.1   1      0.3%

Maximum 10 values

Value  Count  Frequency (%)
59.6   1      0.3%
58     1      0.3%
55.9   1      0.3%
55.8   1      0.3%
55.1   1      0.3%
54.3   1      0.3%
54.2   1      0.3%
53.5   1      0.3%
53.4   1      0.3%
52.8   1      0.3%

bill_depth_mm
Real number (ℝ)

High correlation 

Distinct: 80
Distinct (%): 23.4%
Missing: 2
Missing (%): 0.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.15117
Minimum: 13.1
Maximum: 21.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.8 KiB

Quantile statistics

Minimum: 13.1
5th percentile: 13.9
Q1: 15.6
Median: 17.3
Q3: 18.7
95th percentile: 20
Maximum: 21.5
Range: 8.4
Interquartile range (IQR): 3.1

Descriptive statistics

Standard deviation: 1.9747932
Coefficient of variation (CV): 0.11514044
Kurtosis: -0.90686609
Mean: 17.15117
Median Absolute Deviation (MAD): 1.5
Skewness: -0.14346463
Sum: 5865.7
Variance: 3.899808
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value              Count  Frequency (%)
17                 12     3.5%
15                 10     2.9%
18.6               10     2.9%
17.9               10     2.9%
18.5               10     2.9%
17.3               9      2.6%
18.9               9      2.6%
19                 9      2.6%
17.8               9      2.6%
18.1               9      2.6%
Other values (70)  245    71.2%

Minimum 10 values

Value  Count  Frequency (%)
13.1   1      0.3%
13.2   1      0.3%
13.3   1      0.3%
13.4   1      0.3%
13.5   2      0.6%
13.6   1      0.3%
13.7   6      1.7%
13.8   4      1.2%
13.9   4      1.2%
14     2      0.6%

Maximum 10 values

Value  Count  Frequency (%)
21.5   1      0.3%
21.2   2      0.6%
21.1   3      0.9%
20.8   1      0.3%
20.7   3      0.9%
20.6   1      0.3%
20.5   1      0.3%
20.3   3      0.9%
20.2   1      0.3%
20.1   1      0.3%

flipper_length_mm
Real number (ℝ)

High correlation 

Distinct: 55
Distinct (%): 16.1%
Missing: 2
Missing (%): 0.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 200.9152
Minimum: 172
Maximum: 231
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.8 KiB

Quantile statistics

Minimum: 172
5th percentile: 181
Q1: 190
Median: 197
Q3: 213
95th percentile: 225
Maximum: 231
Range: 59
Interquartile range (IQR): 23

Descriptive statistics

Standard deviation: 14.061714
Coefficient of variation (CV): 0.0699883
Kurtosis: -0.98427289
Mean: 200.9152
Median Absolute Deviation (MAD): 11
Skewness: 0.34568183
Sum: 68713
Variance: 197.73179
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value              Count  Frequency (%)
190                22     6.4%
195                17     4.9%
187                16     4.7%
193                15     4.4%
210                14     4.1%
191                13     3.8%
215                12     3.5%
196                10     2.9%
197                10     2.9%
185                9      2.6%
Other values (45)  204    59.3%

Minimum 10 values

Value  Count  Frequency (%)
172    1      0.3%
174    1      0.3%
176    1      0.3%
178    4      1.2%
179    1      0.3%
180    5      1.5%
181    7      2.0%
182    3      0.9%
183    2      0.6%
184    7      2.0%

Maximum 10 values

Value  Count  Frequency (%)
231    1      0.3%
230    7      2.0%
229    2      0.6%
228    4      1.2%
226    1      0.3%
225    4      1.2%
224    3      0.9%
223    2      0.6%
222    6      1.7%
221    5      1.5%

body_mass_g
Real number (ℝ)

High correlation 

Distinct: 94
Distinct (%): 27.5%
Missing: 2
Missing (%): 0.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 4201.7544
Minimum: 2700
Maximum: 6300
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.8 KiB

Quantile statistics

Minimum: 2700
5th percentile: 3150
Q1: 3550
Median: 4050
Q3: 4750
95th percentile: 5650
Maximum: 6300
Range: 3600
Interquartile range (IQR): 1200

Descriptive statistics

Standard deviation: 801.95454
Coefficient of variation (CV): 0.19086183
Kurtosis: -0.71922187
Mean: 4201.7544
Median Absolute Deviation (MAD): 600
Skewness: 0.47032933
Sum: 1437000
Variance: 643131.08
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value              Count  Frequency (%)
3800               12     3.5%
3700               11     3.2%
3900               10     2.9%
3950               10     2.9%
3550               9      2.6%
4400               8      2.3%
4300               8      2.3%
3450               8      2.3%
3400               8      2.3%
3600               7      2.0%
Other values (84)  251    73.0%

Minimum 10 values

Value  Count  Frequency (%)
2700   1      0.3%
2850   2      0.6%
2900   4      1.2%
2925   1      0.3%
2975   1      0.3%
3000   2      0.6%
3050   4      1.2%
3075   1      0.3%
3100   1      0.3%
3150   4      1.2%

Maximum 10 values

Value  Count  Frequency (%)
6300   1      0.3%
6050   1      0.3%
6000   2      0.6%
5950   2      0.6%
5850   3      0.9%
5800   2      0.6%
5750   1      0.3%
5700   5      1.5%
5650   3      0.9%
5600   2      0.6%

sex
Categorical

High correlation  Missing 

Distinct: 2
Distinct (%): 0.6%
Missing: 11
Missing (%): 3.2%
Memory size: 2.8 KiB

Value   Count
Male    168
Female  165

Length

Max length: 6
Median length: 4
Mean length: 4.990991
Min length: 4

Characters and Unicode

Total characters: 1662
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Male
2nd row: Female
3rd row: Female
4th row: Female
5th row: Male

Common Values

Value      Count  Frequency (%)
Male       168    48.8%
Female     165    48.0%
(Missing)  11     3.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value   Count  Frequency (%)
Male    168    50.5%
Female  165    49.5%
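
The two tables report different percentages for the same counts because they use different denominators. A short sketch of that arithmetic follows; the observation that the plot percentages exclude missing rows is inferred from the numbers in this report, not from the tool's documentation.

```python
# Counts for `sex`, as listed in this report.
counts = {"Male": 168, "Female": 165}
n_missing = 11

n_present = sum(counts.values())  # rows with a recorded value
n_rows = n_present + n_missing    # all rows

for label, c in counts.items():
    pct_all = 100 * c / n_rows         # denominator includes missing rows
    pct_present = 100 * c / n_present  # denominator excludes missing rows
    print(f"{label}: {pct_all:.1f}% of all rows, {pct_present:.1f}% of non-missing rows")
```

With all 344 rows as the denominator the shares are 48.8% and 48.0%; over the 333 non-missing rows they become 50.5% and 49.5%.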

Most occurring characters

Value  Count  Frequency (%)
e      498    30.0%
a      333    20.0%
l      333    20.0%
M      168    10.1%
F      165    9.9%
m      165    9.9%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  1662   100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  1662   100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  1662   100.0%

Most frequent character per category, per script, and per block

All characters fall into a single (unknown) category, script, and block, so each of these three tables repeats the "Most occurring characters" table above.

Interactions

[Figures: 16 interaction plots]

Correlations

[Figure: correlation heatmap]

                   bill_depth_mm  bill_length_mm  body_mass_g  flipper_length_mm  island  sex    species
bill_depth_mm      1.000          -0.222          -0.432       -0.523             0.484   0.586  0.635
bill_length_mm     -0.222         1.000           0.584        0.673              0.324   0.520  0.650
body_mass_g        -0.432         0.584           1.000        0.840              0.456   0.589  0.605
flipper_length_mm  -0.523         0.673           0.840        1.000              0.501   0.448  0.701
island             0.484          0.324           0.456        0.501              1.000   0.000  0.657
sex                0.586          0.520           0.589        0.448              0.000   1.000  0.000
species            0.635          0.650           0.605        0.701              0.657   0.000  1.000
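
The "High correlation" alerts in the Overview can be related back to this matrix. The sketch below assumes that a pair is flagged when the absolute value of its coefficient is at least 0.5; that threshold is an assumption that happens to reproduce every alert count in this report, not a setting read from the report's configuration.

```python
THRESHOLD = 0.5  # assumed alert threshold, not taken from config.json

columns = ["bill_depth_mm", "bill_length_mm", "body_mass_g",
           "flipper_length_mm", "island", "sex", "species"]

# Rows of the correlation matrix, transcribed from this report.
matrix = {
    "bill_depth_mm":     [1.000, -0.222, -0.432, -0.523, 0.484, 0.586, 0.635],
    "bill_length_mm":    [-0.222, 1.000, 0.584, 0.673, 0.324, 0.520, 0.650],
    "body_mass_g":       [-0.432, 0.584, 1.000, 0.840, 0.456, 0.589, 0.605],
    "flipper_length_mm": [-0.523, 0.673, 0.840, 1.000, 0.501, 0.448, 0.701],
    "island":            [0.484, 0.324, 0.456, 0.501, 1.000, 0.000, 0.657],
    "sex":               [0.586, 0.520, 0.589, 0.448, 0.000, 1.000, 0.000],
    "species":           [0.635, 0.650, 0.605, 0.701, 0.657, 0.000, 1.000],
}

def high_correlations(row):
    """Return the other columns whose |coefficient| in this row meets the threshold."""
    return [col for col, value in zip(columns, matrix[row])
            if col != row and abs(value) >= THRESHOLD]

for name in columns:
    flagged = high_correlations(name)
    print(f"{name}: {len(flagged)} field(s) -> {', '.join(flagged)}")
```

Each row is read independently, so the asymmetric entries (for example the sex and species rows) lead to different alert lists for the two variables.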

Missing values

[Figure: count plot] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure: nullity correlation heatmap] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
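
Nullity correlation can be illustrated as the Pearson correlation between missing-value indicators of two columns. The sketch below uses only the first ten rows shown in the Sample section, so the resulting number is an illustration of the idea and not the value plotted in the heatmap; treating nullity correlation as Pearson correlation of indicators is an assumption about how the heatmap is built.

```python
import math

# Missing-value indicators (1 = missing) for rows 0-9 of the Sample section.
bill_length_missing = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
sex_missing         = [0, 0, 0, 1, 0, 0, 0, 0, 1, 1]

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

nullity_corr = pearson(bill_length_missing, sex_missing)
print(f"nullity correlation (first 10 rows only): {nullity_corr:.3f}")
```

A positive value means the two columns tend to be missing in the same rows: here row 3 lacks both values, while rows 8 and 9 lack only sex.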

Sample

First rows

     species  island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex
0    Adelie   Torgersen  39.1            18.7           181.0              3750.0       Male
1    Adelie   Torgersen  39.5            17.4           186.0              3800.0       Female
2    Adelie   Torgersen  40.3            18.0           195.0              3250.0       Female
3    Adelie   Torgersen  NaN             NaN            NaN                NaN          NaN
4    Adelie   Torgersen  36.7            19.3           193.0              3450.0       Female
5    Adelie   Torgersen  39.3            20.6           190.0              3650.0       Male
6    Adelie   Torgersen  38.9            17.8           181.0              3625.0       Female
7    Adelie   Torgersen  39.2            19.6           195.0              4675.0       Male
8    Adelie   Torgersen  34.1            18.1           193.0              3475.0       NaN
9    Adelie   Torgersen  42.0            20.2           190.0              4250.0       NaN

Last rows

     species  island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex
334  Gentoo   Biscoe     46.2            14.1           217.0              4375.0       Female
335  Gentoo   Biscoe     55.1            16.0           230.0              5850.0       Male
336  Gentoo   Biscoe     44.5            15.7           217.0              4875.0       NaN
337  Gentoo   Biscoe     48.8            16.2           222.0              6000.0       Male
338  Gentoo   Biscoe     47.2            13.7           214.0              4925.0       Female
339  Gentoo   Biscoe     NaN             NaN            NaN                NaN          NaN
340  Gentoo   Biscoe     46.8            14.3           215.0              4850.0       Female
341  Gentoo   Biscoe     50.4            15.7           222.0              5750.0       Male
342  Gentoo   Biscoe     45.2            14.8           212.0              5200.0       Female
343  Gentoo   Biscoe     49.9            16.1           213.0              5400.0       Male