Hello,
Would you consider adding a MODE function to alasql? This is very useful in descriptive statistics. It is also helpful for word cloud generation or simply looking at the most frequently used value of something.
Here is a sample implementation I created using various LLMs (Claude Sonnet, Claude Opus, Gemini Pro). It may not be 100% correct so please review it first, but I did try to keep it ANSI SQL compliant.
/**
* MODE - Aggregate function for AlaSQL v4.17
*
* Returns the most frequently occurring non-NULL value in a group.
* Ties are broken by returning the smallest value (per ANSI SQL:2023
* ISO/IEC 9075-2:2023 inverse distribution function semantics).
* NULL inputs are ignored per standard aggregate NULL-handling.
*
* Note: There's a subtlety in v4.17: in stage 2,
* the accumulator variable name in the function signature
* is what carries state. This accounts for that.
*
* Usage in SQL:
* SELECT MODE(column) FROM table GROUP BY ...
*/
alasql.aggr.MODE = function (value, accumulator, stage) {
if (stage === 1) {
var acc = { counts: Object.create(null), values: [] };
if (value != null) {
var key = String(value);
acc.counts[key] = 1;
acc.values.push(value);
}
return acc;
} else if (stage === 2) {
if (value != null) {
var key = String(value);
if (accumulator.counts[key] == null) {
accumulator.counts[key] = 1;
accumulator.values.push(value);
} else {
accumulator.counts[key]++;
}
}
return accumulator;
} else if (stage === 3) {
var counts = accumulator.counts;
var values = accumulator.values;
var maxCount = 0;
var candidates = [];
for (var i = 0; i < values.length; i++) {
var key = String(values[i]);
var count = counts[key];
if (count > maxCount) {
maxCount = count;
candidates = [values[i]];
} else if (count === maxCount) {
candidates.push(values[i]);
}
}
if (candidates.length === 0) return null;
if (candidates.length === 1) return candidates[0];
candidates.sort(function (a, b) {
if (a < b) return -1;
if (a > b) return 1;
return 0;
});
return candidates[0];
}
};
(Note: I use this code in DataLaVista, but since it's also open source, you are free to reuse the code here).
Regards,
Gabriel Mongefranco
Mobile Data Architect
Eisenberg Family Depression Center, University of Michigan
Hello,
Would you consider adding a MODE function to alasql? This is very useful in descriptive statistics. It is also helpful for word cloud generation or simply looking at the most frequently used value of something.
Here is a sample implementation I created using various LLMs (Claude Sonnet, Claude Opus, Gemini Pro). It may not be 100% correct so please review it first, but I did try to keep it ANSI SQL compliant.
(Note: I use this code in DataLaVista, but since it's also open source, you are free to reuse the code here).
Regards,
Gabriel Mongefranco
Mobile Data Architect
Eisenberg Family Depression Center, University of Michigan