sql - Count cumulative total in PostgreSQL

I am using count and group by to get the number of subscribers registered each day:

  SELECT created_at, COUNT(email)  
    FROM subscriptions 
GROUP BY created_at;

Result:

created_at  count
-----------------
04-04-2011  100
05-04-2011   50
06-04-2011   50
07-04-2011  300

Instead, I want the cumulative total of subscribers for each day. How do I get this?

created_at  count
-----------------
04-04-2011  100
05-04-2011  150
06-04-2011  200
07-04-2011  500
khairul asked 2020-01-12T21:16:52Z
5 answers
88 votes

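With larger datasets, window functions are the most efficient way to perform this kind of query: the table is scanned only once, instead of once for each date as a self-join would do. It also looks a lot simpler. :) PostgreSQL 8.4 and later support window functions.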

This is what it looks like:

SELECT created_at, sum(count(email)) OVER (ORDER BY created_at)
FROM subscriptions
GROUP BY created_at;

Here OVER creates the window; ORDER BY created_at means that the counts are summed in created_at order, so each row gets the total of its own count plus all earlier ones.
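
To see the running sum in isolation, here is a minimal sketch that inlines the daily counts from the question as a VALUES list, so it runs without any table. The names daily, cnt and running_total are made up for this illustration.

```sql
-- Daily counts copied from the question; "daily" and "cnt" are illustrative names.
WITH daily(created_at, cnt) AS (
    VALUES (date '2011-04-04', 100),
           (date '2011-04-05', 50),
           (date '2011-04-06', 50),
           (date '2011-04-07', 300)
)
SELECT created_at,
       cnt,
       -- With ORDER BY and no explicit frame, the window covers everything
       -- from the first row up to the current row.
       sum(cnt) OVER (ORDER BY created_at) AS running_total
FROM daily
ORDER BY created_at;
-- running_total: 100, 150, 200, 500
```

In the real query, count(email) with GROUP BY created_at plays the role of cnt: the aggregate is computed per day first, and the window function then sums those per-day results.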


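Edit: If you want to remove duplicate emails within a single day, you can use sum(count(DISTINCT email)). Unfortunately, this won't remove duplicates that span different dates.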
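
A small sketch of that limitation, using a made-up inline dataset (the subs CTE and the example.com addresses are assumptions for illustration only):

```sql
WITH subs(created_at, email) AS (
    VALUES (date '2011-04-04', 'a@example.com'),
           (date '2011-04-04', 'a@example.com'),  -- duplicate on the same day
           (date '2011-04-04', 'b@example.com'),
           (date '2011-04-05', 'a@example.com')   -- duplicate on a later day
)
SELECT created_at,
       -- DISTINCT applies inside each day's group only.
       sum(count(DISTINCT email)) OVER (ORDER BY created_at) AS running_total
FROM subs
GROUP BY created_at
ORDER BY created_at;
-- 2011-04-04: 2
-- 2011-04-05: 3  (a@example.com is counted again on the second day)
```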

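If you want to remove all duplicates, I think the easiest way is to use a subquery and DISTINCT ON. This attributes each email to its earliest date (because the subquery sorts by created_at in ascending order, the earliest row is the one kept):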

SELECT created_at, sum(count(email)) OVER (ORDER BY created_at)
FROM (
    SELECT DISTINCT ON (email) created_at, email
    FROM subscriptions ORDER BY email, created_at
) AS subq
GROUP BY created_at;

If you create an index on (email, created_at), this query shouldn't be too slow either.


(If you want to test, this is how I created the sample dataset.)

create table subscriptions as
   select date '2000-04-04' + (i/10000)::int as created_at,
          'foofoobar@foobar.com' || (i%700000)::text as email
   from generate_series(1,1000000) i;
create index on subscriptions (email, created_at);
intgr answered 2020-01-12T21:17:38Z
7 votes

Use:

SELECT a.created_at,
       (SELECT COUNT(b.email)
          FROM SUBSCRIPTIONS b
         WHERE b.created_at <= a.created_at) AS count
  FROM SUBSCRIPTIONS a
OMG Ponies answered 2020-01-12T21:17:58Z
2 votes
SELECT
  s1.created_at,
  COUNT(s2.email) AS cumul_count
FROM subscriptions s1
  INNER JOIN subscriptions s2 ON s1.created_at >= s2.created_at
GROUP BY s1.created_at
Andriy M answered 2020-01-12T21:18:13Z
2 votes

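I assume you want only one row per day, and that you still want to show days without any subscriptions (if nobody subscribes on a certain date, do you want to show that date with the previous day's balance?). If so, you can use the WITH feature: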

with recursive serialdates(adate) as (
    select cast('2011-04-04' as date)
    union all
    select adate + 1 from serialdates where adate < cast('2011-04-07' as date)
)
select D.adate,
(
    select count(distinct email)
    from subscriptions
    where created_at between date_trunc('month', D.adate) and D.adate
)
from serialdates D
Endy Tjahjono answered 2020-01-12T21:18:34Z
-3 votes

The best way is to have a calendar table:

calendar (
   date date,
   month int,
   quarter int,
   half int,
   week int,
   year int
)

Then you can join to this table to summarize by whichever field you need.
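
A rough sketch of what such a join could look like for the cumulative count, assuming subscriptions.created_at is a date column. The calendar table here is a hypothetical one, reduced to the single column this query needs and filled with generate_series for the four days in the question:

```sql
-- Hypothetical, minimal calendar table (only the date column is used here).
CREATE TABLE calendar (date date PRIMARY KEY);

INSERT INTO calendar (date)
SELECT d::date
FROM generate_series(date '2011-04-04', date '2011-04-07', interval '1 day') AS d;

-- LEFT JOIN keeps days with no subscriptions; count(s.email) is 0 for them,
-- so the running total simply carries over from the previous day.
SELECT c.date,
       sum(count(s.email)) OVER (ORDER BY c.date) AS running_total
FROM calendar c
LEFT JOIN subscriptions s ON s.created_at = c.date
GROUP BY c.date
ORDER BY c.date;
```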

mentat answered 2020-01-12T21:18:58Z
translate from https://stackoverflow.com:/questions/5698452/count-cumulative-total-in-postgresql