random effects model for combining barcodes #56

@nickzoic


We want to be able to use the random effects model to combine barcodes, treating each barcode as a very small replicate in its own right.

The current pivot-and-transform technique isn't suitable for this, but we could register the existing rml_estimate function as a DuckDB UDF and do something like:

select random_effects_model(list(score), list(sigma)) from {view.alias} group by {group_column}

e.g., with the UDF registered under the name rml:

import duckdb
from countess.plugins.random_effects import rml_estimate

c = duckdb.connect()
# register the existing estimator as a scalar UDF returning [score, sigma]
c.create_function("rml", rml_estimate, return_type="DOUBLE[]")
c.sql("create table a (a integer, b integer, c float, d float)")
c.sql("insert into a values (1,1,3.0,1.0), (1,2,2.0,1.5), (2,1,0.5,0.25), (2,2,0.75,0.25)")
c.sql("insert into a values (1,3,1.5,0.5)")
c.sql("select * from a")
# collect each group's scores and sigmas into lists, then unpack the result
c.sql("select a, _R[1] as score, _R[2] as sigma from (select a, rml(list(c), list(d), 50, 1E-7) as _R from a group by a)")
┌───────┬───────┬───────┬───────┐
│   a   │   b   │   c   │   d   │
│ int32 │ int32 │ float │ float │
├───────┼───────┼───────┼───────┤
│     1 │     1 │   3.0 │   1.0 │
│     1 │     2 │   2.0 │   1.5 │
│     2 │     1 │   0.5 │  0.25 │
│     2 │     2 │  0.75 │  0.25 │
│     1 │     3 │   1.5 │   0.5 │
└───────┴───────┴───────┴───────┘

┌───────┬────────────────────┬─────────────────────┐
│   a   │       score        │        sigma        │
│ int32 │       double       │       double        │
├───────┼────────────────────┼─────────────────────┤
│     1 │ 1.9031885740021728 │  0.5180484599374635 │
│     2 │ 0.6250000000000001 │ 0.17677669529663692 │
└───────┴────────────────────┴─────────────────────┘
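For anyone who wants to see the shape of the computation without DuckDB, here is a minimal pure-Python sketch of what the grouped query does: collect each group's scores and sigmas into lists, then hand both lists to an estimator. The function random_effects_sketch is a hypothetical stand-in written for illustration. It uses a common iterative random-effects scheme and is not necessarily identical to countess's rml_estimate, and it omits the epsilon early-stopping argument.

```python
from collections import defaultdict
from math import sqrt


def random_effects_sketch(scores, sigmas, iterations=50):
    """Hypothetical stand-in for rml_estimate (an assumption, not the
    countess implementation). Assumes every sigma is > 0.
    Returns (combined_score, combined_sigma)."""
    n = len(scores)
    if n == 1:
        # Nothing to combine: pass the single barcode through unchanged.
        return scores[0], sigmas[0]
    variances = [s * s for s in sigmas]
    mean = sum(scores) / n
    # Starting guess for the between-barcode variance: the sample variance.
    tau2 = sum((y - mean) ** 2 for y in scores) / (n - 1)
    for _ in range(iterations):
        weights = [1.0 / (v + tau2) for v in variances]
        sw = sum(weights)
        sw2 = sum(w * w for w in weights)
        beta = sum(y * w for y, w in zip(scores, weights)) / sw
        resid = sum((y - beta) ** 2 * w * w for y, w in zip(scores, weights))
        # Multiplicative update of the between-barcode variance.
        tau2 *= resid / (sw - sw2 / sw)
    # Final estimate using the last value of tau2.
    weights = [1.0 / (v + tau2) for v in variances]
    sw = sum(weights)
    beta = sum(y * w for y, w in zip(scores, weights)) / sw
    return beta, sqrt(1.0 / sw)


# Same rows as table a above: (a, b, c, d) = (group, barcode, score, sigma).
rows = [
    (1, 1, 3.0, 1.0),
    (1, 2, 2.0, 1.5),
    (2, 1, 0.5, 0.25),
    (2, 2, 0.75, 0.25),
    (1, 3, 1.5, 0.5),
]

# Equivalent of "select a, list(c), list(d) ... group by a".
groups = defaultdict(lambda: ([], []))
for a, _b, c, d in rows:
    groups[a][0].append(c)
    groups[a][1].append(d)

combined = {a: random_effects_sketch(cs, ds) for a, (cs, ds) in groups.items()}
for a, (score, sigma) in sorted(combined.items()):
    print(a, score, sigma)
```

For group 2, where both sigmas are equal, the sketch reduces to the plain mean 0.625 with sigma 0.25/sqrt(2), about 0.177, which is consistent with the query output above. Group 1 may differ in the later digits from the UDF result, since the stand-in is not the real estimator.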
