Dataset statistics
| Number of variables | 12 |
|---|---|
| Number of observations | 2988181 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Total size in memory | 1.7 GiB |
| Average record size in memory | 597.6 B |
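The size figures above can be cross-checked with a little arithmetic. The sketch below assumes only that the total size is the average record size multiplied by the number of observations; the variable names are illustrative.

```python
# Figures copied from the "Dataset statistics" table.
n_observations = 2_988_181
avg_record_bytes = 597.6           # "Average record size in memory", in bytes

# Total size implied by the average record size.
total_bytes = n_observations * avg_record_bytes
total_gib = total_bytes / 2**30    # 1 GiB = 2**30 bytes

print(f"{total_gib:.1f} GiB")
```

Rounded to one decimal place, the result agrees with the reported total size in memory.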
Variable types
| Categorical | 9 |
|---|---|
| DateTime | 2 |
| Numeric | 1 |
Alerts
| Alert | Type |
|---|---|
| user_id has a high cardinality: 322897 distinct values | High cardinality |
| session_id has a high cardinality: 1048594 distinct values | High cardinality |
| click_article_id has a high cardinality: 46033 distinct values | High cardinality |
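A high-cardinality check of this kind can be sketched as a comparison of each variable's distinct count against a cutoff. The threshold of 50 below is an assumption for illustration; the report does not state which cutoff was used. The distinct counts are taken from this report.

```python
# Distinct counts as reported for four of the variables.
distinct_counts = {
    "user_id": 322_897,
    "session_id": 1_048_594,
    "click_article_id": 46_033,
    "click_environment": 3,
}

# Hypothetical cutoff: the value actually used is not shown in the report.
CARDINALITY_THRESHOLD = 50


def high_cardinality(counts, threshold=CARDINALITY_THRESHOLD):
    """Return the names of variables with more distinct values than threshold."""
    return [name for name, n in counts.items() if n > threshold]


flagged = high_cardinality(distinct_counts)
print(flagged)
```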
Reproduction
| Analysis started | 2022-05-07 17:29:08.631589 |
|---|---|
| Analysis finished | 2022-05-07 17:29:32.735522 |
| Duration | 24.1 seconds |
| Software version | pandas-profiling v3.2.0 |
| Download configuration | config.json |
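The reported duration follows from the two timestamps. A minimal check with the standard library, using the values from the table above:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
started = datetime.fromisoformat("2022-05-07 17:29:08.631589")
finished = datetime.fromisoformat("2022-05-07 17:29:32.735522")

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()

print(f"{duration_seconds:.1f} seconds")
```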
user_id
Categorical
| Distinct | 322897 |
|---|---|
| Distinct (%) | 10.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 177.7 MiB |
| 5890 | 1232 |
|---|---|
| 73574 | 939 |
| 15867 | 900 |
| 80350 | 783 |
| 15275 | 746 |
| Other values (322892) | 2983581 |
Characters and Unicode
| Total characters | 16019411 |
|---|---|
| Distinct characters | 10 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
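As a rough illustration of how such character properties can be counted, the sketch below uses Python's standard `unicodedata` module. It covers only the general category; the standard library does not expose script or block, so those are left out. The function name and the category-name mapping are illustrative, not the profiler's own code.

```python
import unicodedata
from collections import Counter

# Long names for the two-letter general-category codes seen in this report.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw code for categories not in the mapping.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


digits_only = category_counts(["5890", "73574"])   # two user_id values
mixed = category_counts(["4 - Web"])               # a click_environment value
print(digits_only, mixed)
```

A purely numeric identifier column yields a single category, while a label such as `4 - Web` contributes to five.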
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 2 |
Common Values
| Value | Count | Frequency (%) |
| 5890 | 1232 | < 0.1% |
| 73574 | 939 | < 0.1% |
| 15867 | 900 | < 0.1% |
| 80350 | 783 | < 0.1% |
| 15275 | 746 | < 0.1% |
| 2151 | 722 | < 0.1% |
| 4568 | 529 | < 0.1% |
| 12897 | 513 | < 0.1% |
| 11521 | 502 | < 0.1% |
| 34541 | 501 | < 0.1% |
| Other values (322887) | 2980814 |
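The Frequency (%) column is each count divided by the number of observations. The sketch below reproduces the formatting seen in these tables under an assumed rule: a share that rounds to zero at three decimal places (as a fraction) is shown as "< 0.1%". That rule is inferred from the values in this report, not confirmed from the tool's source.

```python
TOTAL_ROWS = 2_988_181   # Number of observations


def format_frequency(count, total=TOTAL_ROWS):
    """Format count/total as a percentage with one decimal place.

    Assumed rule: non-zero shares that round to 0.000 as a fraction
    are rendered as "< 0.1%".
    """
    share = count / total
    if share > 0 and round(share, 3) == 0:
        return "< 0.1%"
    return f"{share * 100:.1f}%"


print(format_frequency(1232))      # most frequent user_id value
print(format_frequency(2980814))   # "Other values" row
```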
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 2406584 | |
| 2 | 1986207 | |
| 3 | 1547037 | |
| 5 | 1517002 | |
| 4 | 1508460 | |
| 6 | 1456353 | |
| 7 | 1433310 | |
| 8 | 1410350 | |
| 9 | 1392458 | |
| 0 | 1361650 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 16019411 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 2406584 | |
| 2 | 1986207 | |
| 3 | 1547037 | |
| 5 | 1517002 | |
| 4 | 1508460 | |
| 6 | 1456353 | |
| 7 | 1433310 | |
| 8 | 1410350 | |
| 9 | 1392458 | |
| 0 | 1361650 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 16019411 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 2406584 | |
| 2 | 1986207 | |
| 3 | 1547037 | |
| 5 | 1517002 | |
| 4 | 1508460 | |
| 6 | 1456353 | |
| 7 | 1433310 | |
| 8 | 1410350 | |
| 9 | 1392458 | |
| 0 | 1361650 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 16019411 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 2406584 | |
| 2 | 1986207 | |
| 3 | 1547037 | |
| 5 | 1517002 | |
| 4 | 1508460 | |
| 6 | 1456353 | |
| 7 | 1433310 | |
| 8 | 1410350 | |
| 9 | 1392458 | |
| 0 | 1361650 |
session_id
Categorical
| Distinct | 1048594 |
|---|---|
| Distinct (%) | 35.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 208.0 MiB |
| 1507563657895091 | 124 |
|---|---|
| 1507896573228093 | 107 |
| 1507133567968022 | 106 |
| 1507309773225261 | 98 |
| 1508112331270612 | 94 |
| Other values (1048589) | 2987652 |
Characters and Unicode
| Total characters | 47810896 |
|---|---|
| Distinct characters | 10 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1506825423271737 |
|---|---|
| 2nd row | 1506825423271737 |
| 3rd row | 1506825426267738 |
| 4th row | 1506825426267738 |
| 5th row | 1506825435299739 |
Common Values
| Value | Count | Frequency (%) |
| 1507563657895091 | 124 | < 0.1% |
| 1507896573228093 | 107 | < 0.1% |
| 1507133567968022 | 106 | < 0.1% |
| 1507309773225261 | 98 | < 0.1% |
| 1508112331270612 | 94 | < 0.1% |
| 1507647366292530 | 92 | < 0.1% |
| 1507475403662486 | 86 | < 0.1% |
| 1506959499272114 | 82 | < 0.1% |
| 1508154737228813 | 79 | < 0.1% |
| 1506999909218419 | 75 | < 0.1% |
| Other values (1048584) | 2987238 |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 7222437 | |
| 5 | 6370248 | |
| 0 | 6306506 | |
| 7 | 5505572 | |
| 2 | 4058812 | |
| 3 | 3977203 | |
| 6 | 3794560 | |
| 8 | 3596989 | |
| 9 | 3536107 | |
| 4 | 3442462 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 47810896 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 7222437 | |
| 5 | 6370248 | |
| 0 | 6306506 | |
| 7 | 5505572 | |
| 2 | 4058812 | |
| 3 | 3977203 | |
| 6 | 3794560 | |
| 8 | 3596989 | |
| 9 | 3536107 | |
| 4 | 3442462 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 47810896 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 7222437 | |
| 5 | 6370248 | |
| 0 | 6306506 | |
| 7 | 5505572 | |
| 2 | 4058812 | |
| 3 | 3977203 | |
| 6 | 3794560 | |
| 8 | 3596989 | |
| 9 | 3536107 | |
| 4 | 3442462 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 47810896 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 7222437 | |
| 5 | 6370248 | |
| 0 | 6306506 | |
| 7 | 5505572 | |
| 2 | 4058812 | |
| 3 | 3977203 | |
| 6 | 3794560 | |
| 8 | 3596989 | |
| 9 | 3536107 | |
| 4 | 3442462 |
session_start
Date
| Distinct | 646874 |
|---|---|
| Distinct (%) | 21.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 22.8 MiB |
| Minimum | 2017-10-01 04:37:03 |
|---|---|
| Maximum | 2017-10-17 05:36:19 |
Histogram with fixed size bins (bins=50)
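Fixed-size binning of a datetime variable can be sketched as follows: the span between the minimum and maximum is split into 50 equal-width intervals and each timestamp is assigned to one of them. The function name is illustrative, and placing the maximum in the last bin is an assumption about edge handling.

```python
from datetime import datetime


def bin_index(ts, start, end, bins=50):
    """Return the 0-based index of the equal-width bin that contains ts."""
    if not start <= ts <= end:
        raise ValueError("timestamp outside [start, end]")
    width = (end - start) / bins           # timedelta / int -> timedelta
    # timedelta / timedelta -> float; the maximum falls into the last bin.
    return min(int((ts - start) / width), bins - 1)


start = datetime(2017, 10, 1, 4, 37, 3)    # Minimum of session_start
end = datetime(2017, 10, 17, 5, 36, 19)    # Maximum of session_start

print(bin_index(datetime(2017, 10, 9), start, end))
```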
session_size
Real number (ℝ≥0)
| Distinct | 72 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.901885127 |
| Minimum | 2 |
|---|---|
| Maximum | 124 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 22.8 MiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 2 |
| median | 3 |
| Q3 | 4 |
| 95-th percentile | 9 |
| Maximum | 124 |
| Range | 122 |
| Interquartile range (IQR) | 2 |
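The range and interquartile range are differences of the quantiles listed above. The sketch below computes them with the standard library on a small hypothetical sample, chosen only so that its quartiles match the reported ones; it is not the real column.

```python
import statistics

# Hypothetical sample of session sizes (not the actual data).
sample = [2, 2, 2, 3, 3, 4, 4, 9, 124]

# "inclusive" interpolates between the sample minimum and maximum.
q1, median, q3 = statistics.quantiles(sample, n=4, method="inclusive")

value_range = max(sample) - min(sample)   # Maximum - Minimum
iqr = q3 - q1                             # Q3 - Q1

print(q1, median, q3, value_range, iqr)
```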
Descriptive statistics
| Standard deviation | 3.929941495 |
|---|---|
| Coefficient of variation (CV) | 1.007190465 |
| Kurtosis | 158.4608899 |
| Mean | 3.901885127 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 9.090074854 |
| Sum | 11659539 |
| Variance | 15.44444016 |
| Monotonicity | Not monotonic |
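Several of the descriptive statistics are tied together by simple identities: the mean is the sum divided by the number of observations, the variance is the squared standard deviation, and the coefficient of variation is the standard deviation divided by the mean. A quick consistency check using the reported figures:

```python
import math

n = 2_988_181         # Number of observations
total = 11_659_539    # Sum
std = 3.929941495     # Standard deviation

mean = total / n
variance = std ** 2
cv = std / mean       # Coefficient of variation

print(mean, variance, cv)
print(math.isclose(cv, 1.007190465, abs_tol=1e-6))
```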
Histogram with fixed size bins (bins=50)
Common Values
| Value | Count | Frequency (%) |
| 2 | 1260372 | 42.2% |
| 3 | 670185 | 22.4% |
| 4 | 374240 | 12.5% |
| 5 | 220105 | 7.4% |
| 6 | 135762 | 4.5% |
| 7 | 88354 | 3.0% |
| 8 | 58544 | 2.0% |
| 9 | 40878 | 1.4% |
| 10 | 29530 | 1.0% |
| 11 | 21714 | 0.7% |
| Other values (62) | 88497 | 3.0% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 2 | 1260372 | 42.2% |
| 3 | 670185 | 22.4% |
| 4 | 374240 | 12.5% |
| 5 | 220105 | 7.4% |
| 6 | 135762 | 4.5% |
| 7 | 88354 | 3.0% |
| 8 | 58544 | 2.0% |
| 9 | 40878 | 1.4% |
| 10 | 29530 | 1.0% |
| 11 | 21714 | 0.7% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 124 | 124 | < 0.1% |
| 107 | 107 | < 0.1% |
| 106 | 106 | < 0.1% |
| 98 | 98 | < 0.1% |
| 94 | 94 | < 0.1% |
| 92 | 92 | < 0.1% |
| 86 | 86 | < 0.1% |
| 82 | 82 | < 0.1% |
| 79 | 79 | < 0.1% |
| 75 | 75 | < 0.1% |
click_article_id
Categorical
| Distinct | 46033 |
|---|---|
| Distinct (%) | 1.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 179.0 MiB |
| 160974 | 37213 |
|---|---|
| 272143 | 28943 |
| 336221 | 23851 |
| 234698 | 23499 |
| 123909 | 23122 |
| Other values (46028) | 2851553 |
Characters and Unicode
| Total characters | 17347006 |
|---|---|
| Distinct characters | 10 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 24811 |
|---|---|
| Unique (%) | 0.8% |
Sample
| 1st row | 157541 |
|---|---|
| 2nd row | 68866 |
| 3rd row | 235840 |
| 4th row | 96663 |
| 5th row | 119592 |
Common Values
| Value | Count | Frequency (%) |
| 160974 | 37213 | 1.2% |
| 272143 | 28943 | 1.0% |
| 336221 | 23851 | 0.8% |
| 234698 | 23499 | 0.8% |
| 123909 | 23122 | 0.8% |
| 336223 | 21855 | 0.7% |
| 96210 | 21577 | 0.7% |
| 162655 | 21062 | 0.7% |
| 183176 | 20303 | 0.7% |
| 168623 | 19526 | 0.7% |
| Other values (46023) | 2747230 |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 2669004 | |
| 1 | 2322402 | |
| 3 | 2172869 | |
| 6 | 1692346 | |
| 5 | 1494065 | |
| 0 | 1440544 | |
| 8 | 1433872 | |
| 4 | 1406484 | |
| 9 | 1401337 | |
| 7 | 1314083 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 17347006 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 2669004 | |
| 1 | 2322402 | |
| 3 | 2172869 | |
| 6 | 1692346 | |
| 5 | 1494065 | |
| 0 | 1440544 | |
| 8 | 1433872 | |
| 4 | 1406484 | |
| 9 | 1401337 | |
| 7 | 1314083 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 17347006 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 2669004 | |
| 1 | 2322402 | |
| 3 | 2172869 | |
| 6 | 1692346 | |
| 5 | 1494065 | |
| 0 | 1440544 | |
| 8 | 1433872 | |
| 4 | 1406484 | |
| 9 | 1401337 | |
| 7 | 1314083 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 17347006 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 2669004 | |
| 1 | 2322402 | |
| 3 | 2172869 | |
| 6 | 1692346 | |
| 5 | 1494065 | |
| 0 | 1440544 | |
| 8 | 1433872 | |
| 4 | 1406484 | |
| 9 | 1401337 | |
| 7 | 1314083 |
click_timestamp
Date
| Distinct | 1016184 |
|---|---|
| Distinct (%) | 34.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 22.8 MiB |
| Minimum | 2017-10-01 05:00:00 |
|---|---|
| Maximum | 2017-11-13 21:04:14 |
Histogram with fixed size bins (bins=50)
click_environment
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 183.0 MiB |
| 4 - Web | 2904478 |
|---|---|
| 2 - Mobile App | 79743 |
| 1 - Facebook Instant Article | 3960 |
Characters and Unicode
| Total characters | 21558628 |
|---|---|
| Distinct characters | 23 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 4 - Web |
|---|---|
| 2nd row | 4 - Web |
| 3rd row | 4 - Web |
| 4th row | 4 - Web |
| 5th row | 4 - Web |
Common Values
| Value | Count | Frequency (%) |
| 4 - Web | 2904478 | 97.2% |
| 2 - Mobile App | 79743 | 2.7% |
| 1 - Facebook Instant Article | 3960 | 0.1% |
Category Frequency Plot
| Value | Count | Frequency (%) |
| (blank) | 2988181 | |
| 4 | 2904478 | |
| web | 2904478 | |
| 2 | 79743 | 0.9% |
| mobile | 79743 | 0.9% |
| app | 79743 | 0.9% |
| 1 | 3960 | < 0.1% |
| facebook | 3960 | < 0.1% |
| instant | 3960 | < 0.1% |
| article | 3960 | < 0.1% |
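The word counts above appear to come from splitting each category label into lowercase tokens and weighting each token by the label's row count. The sketch below is one way to do that; the tokenisation rule (runs of letters and digits, so the dash is dropped) is an assumption and differs from the table, which also lists a blank token.

```python
import re
from collections import Counter


def word_counts(value_counts):
    """Count lowercase alphanumeric tokens, weighted by each value's row count."""
    words = Counter()
    for value, count in value_counts.items():
        for token in re.findall(r"[a-z0-9]+", value.lower()):
            words[token] += count
    return words


# Row counts from the click_environment "Common Values" table.
env_counts = {
    "4 - Web": 2_904_478,
    "2 - Mobile App": 79_743,
    "1 - Facebook Instant Article": 3_960,
}
words = word_counts(env_counts)
print(words.most_common(3))
```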
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 6064025 | |
| e | 2992141 | |
| - | 2988181 | |
| b | 2988181 | |
| 4 | 2904478 | |
| W | 2904478 | |
| p | 159486 | 0.7% |
| o | 87663 | 0.4% |
| l | 83703 | 0.4% |
| A | 83703 | 0.4% |
| Other values (13) | 302589 | 1.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 6442397 | |
| Space Separator | 6064025 | |
| Uppercase Letter | 3075844 | |
| Dash Punctuation | 2988181 | |
| Decimal Number | 2988181 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 2992141 | |
| b | 2988181 | |
| p | 159486 | 2.5% |
| o | 87663 | 1.4% |
| l | 83703 | 1.3% |
| i | 83703 | 1.3% |
| t | 11880 | 0.2% |
| a | 7920 | 0.1% |
| c | 7920 | 0.1% |
| n | 7920 | 0.1% |
| Other values (3) | 11880 | 0.2% |
Uppercase Letter
| Value | Count | Frequency (%) |
| W | 2904478 | |
| A | 83703 | 2.7% |
| M | 79743 | 2.6% |
| F | 3960 | 0.1% |
| I | 3960 | 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 4 | 2904478 | |
| 2 | 79743 | 2.7% |
| 1 | 3960 | 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 6064025 | |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 2988181 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 12040387 | |
| Latin | 9518241 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 2992141 | |
| b | 2988181 | |
| W | 2904478 | |
| p | 159486 | 1.7% |
| o | 87663 | 0.9% |
| l | 83703 | 0.9% |
| A | 83703 | 0.9% |
| i | 83703 | 0.9% |
| M | 79743 | 0.8% |
| t | 11880 | 0.1% |
| Other values (8) | 43560 | 0.5% |
Common
| Value | Count | Frequency (%) |
| (space) | 6064025 | |
| - | 2988181 | |
| 4 | 2904478 | |
| 2 | 79743 | 0.7% |
| 1 | 3960 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 21558628 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 6064025 | |
| e | 2992141 | |
| - | 2988181 | |
| b | 2988181 | |
| 4 | 2904478 | |
| W | 2904478 | |
| p | 159486 | 0.7% |
| o | 87663 | 0.4% |
| l | 83703 | 0.4% |
| A | 83703 | 0.4% |
| Other values (13) | 302589 | 1.4% |
click_deviceGroup
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 189.9 MiB |
| 1 - Tablet | 1823162 |
|---|---|
| 3 - Empty | 1047086 |
| 4 - Mobile | 117640 |
| 5 - Desktop | 283 |
| 2 - TV | 10 |
Characters and Unicode
| Total characters | 28834967 |
|---|---|
| Distinct characters | 24 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 3 - Empty |
|---|---|
| 2nd row | 3 - Empty |
| 3rd row | 1 - Tablet |
| 4th row | 1 - Tablet |
| 5th row | 1 - Tablet |
Common Values
| Value | Count | Frequency (%) |
| 1 - Tablet | 1823162 | 61.0% |
| 3 - Empty | 1047086 | 35.0% |
| 4 - Mobile | 117640 | 3.9% |
| 5 - Desktop | 283 | < 0.1% |
| 2 - TV | 10 | < 0.1% |
Category Frequency Plot
| Value | Count | Frequency (%) |
| (blank) | 2988181 | |
| 1 | 1823162 | |
| tablet | 1823162 | |
| 3 | 1047086 | 11.7% |
| empty | 1047086 | 11.7% |
| 4 | 117640 | 1.3% |
| mobile | 117640 | 1.3% |
| 5 | 283 | < 0.1% |
| desktop | 283 | < 0.1% |
| 2 | 10 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 5976362 | |
| - | 2988181 | |
| t | 2870531 | |
| e | 1941085 | 6.7% |
| b | 1940802 | 6.7% |
| l | 1940802 | 6.7% |
| T | 1823172 | 6.3% |
| 1 | 1823162 | 6.3% |
| a | 1823162 | 6.3% |
| p | 1047369 | 3.6% |
| Other values (14) | 4660339 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 13894052 | |
| Space Separator | 5976362 | |
| Uppercase Letter | 2988191 | 10.4% |
| Dash Punctuation | 2988181 | 10.4% |
| Decimal Number | 2988181 | 10.4% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| t | 2870531 | |
| e | 1941085 | |
| b | 1940802 | |
| l | 1940802 | |
| a | 1823162 | |
| p | 1047369 | 7.5% |
| m | 1047086 | 7.5% |
| y | 1047086 | 7.5% |
| o | 117923 | 0.8% |
| i | 117640 | 0.8% |
| Other values (2) | 566 | < 0.1% |
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 1823172 | |
| E | 1047086 | |
| M | 117640 | 3.9% |
| D | 283 | < 0.1% |
| V | 10 | < 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 1823162 | |
| 3 | 1047086 | |
| 4 | 117640 | 3.9% |
| 5 | 283 | < 0.1% |
| 2 | 10 | < 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 5976362 | |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 2988181 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 16882243 | |
| Common | 11952724 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| t | 2870531 | |
| e | 1941085 | |
| b | 1940802 | |
| l | 1940802 | |
| T | 1823172 | |
| a | 1823162 | |
| p | 1047369 | 6.2% |
| E | 1047086 | 6.2% |
| m | 1047086 | 6.2% |
| y | 1047086 | 6.2% |
| Other values (7) | 354062 | 2.1% |
Common
| Value | Count | Frequency (%) |
| (space) | 5976362 | |
| - | 2988181 | |
| 1 | 1823162 | 15.3% |
| 3 | 1047086 | 8.8% |
| 4 | 117640 | 1.0% |
| 5 | 283 | < 0.1% |
| 2 | 10 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 28834967 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 5976362 | |
| - | 2988181 | |
| t | 2870531 | |
| e | 1941085 | 6.7% |
| b | 1940802 | 6.7% |
| l | 1940802 | 6.7% |
| T | 1823172 | 6.3% |
| 1 | 1823162 | 6.3% |
| a | 1823162 | 6.3% |
| p | 1047369 | 3.6% |
| Other values (14) | 4660339 |
click_os
Categorical
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 198.8 MiB |
| 17 - Firefox OS | 1738138 |
|---|---|
| 2 - iOS | 788699 |
| 20 - Chromecast | 369586 |
| 12 - tvOS | 60096 |
| 13 - Chrome OS | 23711 |
| Other values (3) | 7951 |
Characters and Unicode
| Total characters | 38114007 |
|---|---|
| Distinct characters | 36 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 20 - Chromecast |
|---|---|
| 2nd row | 20 - Chromecast |
| 3rd row | 17 - Firefox OS |
| 4th row | 17 - Firefox OS |
| 5th row | 17 - Firefox OS |
Common Values
| Value | Count | Frequency (%) |
| 17 - Firefox OS | 1738138 | 58.2% |
| 2 - iOS | 788699 | 26.4% |
| 20 - Chromecast | 369586 | 12.4% |
| 12 - tvOS | 60096 | 2.0% |
| 13 - Chrome OS | 23711 | 0.8% |
| 19 - Brew MP | 6384 | 0.2% |
| 5 - Windows Mobile | 1513 | 0.1% |
| 3 - Android | 54 | < 0.1% |
Category Frequency Plot
| Value | Count | Frequency (%) |
| (blank) | 2988181 | |
| os | 1761849 | |
| 17 | 1738138 | |
| firefox | 1738138 | |
| 2 | 788699 | 7.3% |
| ios | 788699 | 7.3% |
| 20 | 369586 | 3.4% |
| chromecast | 369586 | 3.4% |
| 12 | 60096 | 0.6% |
| tvos | 60096 | 0.6% |
| Other values (10) | 71221 | 0.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 7746108 | |
| - | 2988181 | 7.8% |
| O | 2610644 | 6.8% |
| S | 2610644 | 6.8% |
| i | 2529917 | 6.6% |
| e | 2139332 | 5.6% |
| r | 2137873 | 5.6% |
| o | 2134515 | 5.6% |
| 1 | 1828329 | 4.8% |
| x | 1738138 | 4.6% |
| Other values (26) | 9650326 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 14818667 | |
| Space Separator | 7746108 | |
| Uppercase Letter | 7374955 | |
| Decimal Number | 5186096 | 13.6% |
| Dash Punctuation | 2988181 | 7.8% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| i | 2529917 | |
| e | 2139332 | |
| r | 2137873 | |
| o | 2134515 | |
| x | 1738138 | |
| f | 1738138 | |
| t | 429682 | 2.9% |
| h | 393297 | 2.7% |
| m | 393297 | 2.7% |
| s | 371099 | 2.5% |
| Other values (8) | 813379 | 5.5% |
Uppercase Letter
| Value | Count | Frequency (%) |
| O | 2610644 | |
| S | 2610644 | |
| F | 1738138 | |
| C | 393297 | 5.3% |
| M | 7897 | 0.1% |
| B | 6384 | 0.1% |
| P | 6384 | 0.1% |
| W | 1513 | < 0.1% |
| A | 54 | < 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 1828329 | |
| 7 | 1738138 | |
| 2 | 1218381 | |
| 0 | 369586 | 7.1% |
| 3 | 23765 | 0.5% |
| 9 | 6384 | 0.1% |
| 5 | 1513 | < 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 7746108 | |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 2988181 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 22193622 | |
| Common | 15920385 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| O | 2610644 | |
| S | 2610644 | |
| i | 2529917 | |
| e | 2139332 | |
| r | 2137873 | |
| o | 2134515 | |
| x | 1738138 | |
| f | 1738138 | |
| F | 1738138 | |
| t | 429682 | 1.9% |
| Other values (17) | 2386601 |
Common
| Value | Count | Frequency (%) |
| (space) | 7746108 | |
| - | 2988181 | 18.8% |
| 1 | 1828329 | 11.5% |
| 7 | 1738138 | 10.9% |
| 2 | 1218381 | 7.7% |
| 0 | 369586 | 2.3% |
| 3 | 23765 | 0.1% |
| 9 | 6384 | < 0.1% |
| 5 | 1513 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 38114007 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 7746108 | |
| - | 2988181 | 7.8% |
| O | 2610644 | 6.8% |
| S | 2610644 | 6.8% |
| i | 2529917 | 6.6% |
| e | 2139332 | 5.6% |
| r | 2137873 | 5.6% |
| o | 2134515 | 5.6% |
| 1 | 1828329 | 4.8% |
| x | 1738138 | 4.6% |
| Other values (26) | 9650326 |
click_country
Categorical
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 165.4 MiB |
| 1 | 2852406 |
|---|---|
| 10 | 61377 |
| 11 | 29999 |
| 8 | 9556 |
| 6 | 7256 |
| Other values (6) | 27587 |
Characters and Unicode
| Total characters | 3079557 |
|---|---|
| Distinct characters | 10 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 1 | 2852406 | 95.5% |
| 10 | 61377 | 2.1% |
| 11 | 29999 | 1.0% |
| 8 | 9556 | 0.3% |
| 6 | 7256 | 0.2% |
| 9 | 6746 | 0.2% |
| 2 | 6101 | 0.2% |
| 3 | 4540 | 0.2% |
| 5 | 3498 | 0.1% |
| 4 | 3389 | 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 2973781 | |
| 0 | 61377 | 2.0% |
| 8 | 9556 | 0.3% |
| 6 | 7256 | 0.2% |
| 9 | 6746 | 0.2% |
| 2 | 6101 | 0.2% |
| 3 | 4540 | 0.1% |
| 5 | 3498 | 0.1% |
| 4 | 3389 | 0.1% |
| 7 | 3313 | 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 3079557 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 2973781 | |
| 0 | 61377 | 2.0% |
| 8 | 9556 | 0.3% |
| 6 | 7256 | 0.2% |
| 9 | 6746 | 0.2% |
| 2 | 6101 | 0.2% |
| 3 | 4540 | 0.1% |
| 5 | 3498 | 0.1% |
| 4 | 3389 | 0.1% |
| 7 | 3313 | 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 3079557 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 2973781 | |
| 0 | 61377 | 2.0% |
| 8 | 9556 | 0.3% |
| 6 | 7256 | 0.2% |
| 9 | 6746 | 0.2% |
| 2 | 6101 | 0.2% |
| 3 | 4540 | 0.1% |
| 5 | 3498 | 0.1% |
| 4 | 3389 | 0.1% |
| 7 | 3313 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3079557 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 2973781 | |
| 0 | 61377 | 2.0% |
| 8 | 9556 | 0.3% |
| 6 | 7256 | 0.2% |
| 9 | 6746 | 0.2% |
| 2 | 6101 | 0.2% |
| 3 | 4540 | 0.1% |
| 5 | 3498 | 0.1% |
| 4 | 3389 | 0.1% |
| 7 | 3313 | 0.1% |
click_region
Categorical
| Distinct | 28 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 167.6 MiB |
| 25 | 804985 |
|---|---|
| 21 | 464230 |
| 13 | 320957 |
| 8 | 179339 |
| 16 | 164884 |
| Other values (23) | 1053786 |
Characters and Unicode
| Total characters | 5435935 |
|---|---|
| Distinct characters | 10 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 20 |
|---|---|
| 2nd row | 20 |
| 3rd row | 16 |
| 4th row | 16 |
| 5th row | 24 |
Common Values
| Value | Count | Frequency (%) |
| 25 | 804985 | 26.9% |
| 21 | 464230 | 15.5% |
| 13 | 320957 | 10.7% |
| 8 | 179339 | 6.0% |
| 16 | 164884 | 5.5% |
| 28 | 135793 | 4.5% |
| 24 | 130537 | 4.4% |
| 20 | 120884 | 4.0% |
| 5 | 96979 | 3.2% |
| 9 | 84693 | 2.8% |
| Other values (18) | 484900 |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 1767881 | |
| 1 | 1247851 | |
| 5 | 931499 | |
| 8 | 330215 | 6.1% |
| 3 | 324997 | 6.0% |
| 6 | 241031 | 4.4% |
| 4 | 186510 | 3.4% |
| 7 | 144287 | 2.7% |
| 0 | 142879 | 2.6% |
| 9 | 118785 | 2.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 5435935 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 1767881 | |
| 1 | 1247851 | |
| 5 | 931499 | |
| 8 | 330215 | 6.1% |
| 3 | 324997 | 6.0% |
| 6 | 241031 | 4.4% |
| 4 | 186510 | 3.4% |
| 7 | 144287 | 2.7% |
| 0 | 142879 | 2.6% |
| 9 | 118785 | 2.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 5435935 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 1767881 | |
| 1 | 1247851 | |
| 5 | 931499 | |
| 8 | 330215 | 6.1% |
| 3 | 324997 | 6.0% |
| 6 | 241031 | 4.4% |
| 4 | 186510 | 3.4% |
| 7 | 144287 | 2.7% |
| 0 | 142879 | 2.6% |
| 9 | 118785 | 2.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 5435935 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 1767881 | |
| 1 | 1247851 | |
| 5 | 931499 | |
| 8 | 330215 | 6.1% |
| 3 | 324997 | 6.0% |
| 6 | 241031 | 4.4% |
| 4 | 186510 | 3.4% |
| 7 | 144287 | 2.7% |
| 0 | 142879 | 2.6% |
| 9 | 118785 | 2.2% |
click_referrer_type
Categorical
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 165.3 MiB |
| 2 | 1602601 |
|---|---|
| 1 | 1194321 |
| 5 | 80766 |
| 7 | 69798 |
| 6 | 20455 |
| Other values (2) | 20240 |
Characters and Unicode
| Total characters | 2988181 |
|---|---|
| Distinct characters | 7 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2 |
|---|---|
| 2nd row | 2 |
| 3rd row | 2 |
| 4th row | 2 |
| 5th row | 2 |
Common Values
| Value | Count | Frequency (%) |
| 2 | 1602601 | 53.6% |
| 1 | 1194321 | 40.0% |
| 5 | 80766 | 2.7% |
| 7 | 69798 | 2.3% |
| 6 | 20455 | 0.7% |
| 4 | 19820 | 0.7% |
| 3 | 420 | < 0.1% |
Category Frequency Plot
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 1602601 | |
| 1 | 1194321 | |
| 5 | 80766 | 2.7% |
| 7 | 69798 | 2.3% |
| 6 | 20455 | 0.7% |
| 4 | 19820 | 0.7% |
| 3 | 420 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 2988181 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 1602601 | |
| 1 | 1194321 | |
| 5 | 80766 | 2.7% |
| 7 | 69798 | 2.3% |
| 6 | 20455 | 0.7% |
| 4 | 19820 | 0.7% |
| 3 | 420 | < 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 2988181 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 1602601 | |
| 1 | 1194321 | |
| 5 | 80766 | 2.7% |
| 7 | 69798 | 2.3% |
| 6 | 20455 | 0.7% |
| 4 | 19820 | 0.7% |
| 3 | 420 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2988181 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 1602601 | |
| 1 | 1194321 | |
| 5 | 80766 | 2.7% |
| 7 | 69798 | 2.3% |
| 6 | 20455 | 0.7% |
| 4 | 19820 | 0.7% |
| 3 | 420 | < 0.1% |