DevHeads.net

Performance

postgres 8.4, COPY, and high concurrency

I was working on a data warehousing project where a fair number of files
could be COPY'd more or less directly into tables. I have a somewhat nice
machine to work with, and I ran on 75% of the cores I have (75% of 32 is
24).

Performance was pretty bad. With 24 processes going, each backend (in COPY)
spent 98% of its time in semop (as identified by strace).
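
For anyone digging into a case like this, heavyweight lock waits can be inspected from another session; note that lightweight-lock contention (the usual reason backends sit in semop) will not show up in pg_locks. A small sketch using the 8.4-era column names:

-- backends currently waiting on a heavyweight lock:
SELECT a.procpid, a.waiting, a.current_query
FROM pg_stat_activity a
WHERE a.waiting;

-- the ungranted lock requests themselves:
SELECT locktype, mode, pid
FROM pg_locks
WHERE NOT granted;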

performance regression with 9.2

This query is a couple of orders of magnitude slower on 9.2.1 than on 9.1;
the first result below is from 9.2.1, the second from 9.1:

=# explain analyze SELECT note_sets."id" AS t0_r0,
note_sets."note_set_source_id" AS t0_r1, note_sets."parent_id" AS t0_r2,
note_sets."business_entity_id" AS t0_r3, note_sets."created_at" AS t0_r4,
note_sets."updated_at" AS t0_r5, note_sets."created_by" AS t0_r6,
note_sets."updated_by" AS t0_r7, note_set_sources."id" AS t1_r0,
note_set_sources."name" AS t1_r1, note_set_sources."code" AS t1_r2,
note_set_sources."description" AS t1_r3, note_set_sources."status" AS
t1_r4, note_set_sources."created_at" AS

PostgreSQL v9.2 uses a lot of memory on Windows XP

Hi,

I have installed PostgreSQL v9.2 on Windows XP SP3.

My PC specs:
Processor: Pentium Dual Core 2.09 GHz
RAM: 2GB

PostgreSQL runs as a Windows service (manual startup).

The problem is that the PostgreSQL service uses a lot of memory and makes
the OS lag after running for a long time (about 2 hours or more), so I have
to restart the PostgreSQL service every time this happens. I have never run
any big queries so far.

fast read of binary data

Dear All

I am currently implementing a compressed binary storage scheme for
genotyping data. These are basically vectors of binary data which may be
megabytes in size.

Our current implementation uses the data type bit varying.

What we want to do is very simple: we want to retrieve such records from
the database and transfer them unaltered to the client, which will do
something with them (uncompress them).
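
One commonly suggested alternative is bytea with the client requesting results in binary format, so the server never renders the value as text. A minimal sketch, with hypothetical table and column names:

CREATE TABLE genotype (
    id      bigint PRIMARY KEY,
    payload bytea NOT NULL      -- compressed vector, stored and returned as-is
);

-- The client should ask for binary results (e.g. libpq PQexecParams with
-- resultFormat = 1) to avoid the text escaping/unescaping overhead:
SELECT payload FROM genotype WHERE id = 42;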

Index is not being used

Hi All

I am facing a query performance problem on one of my test servers.

How can I create an index on the column so that it gets used here?

EXPLAIN select xxx.* from xxx xxx where exists (select 1 from tmp
where mdc_domain_reverse like xxx.reverse_pd || '.%');

QUERY PLAN
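
To answer the "how can I create an index on the column" part: for LIKE prefix matches in a non-C locale the index usually needs the text_pattern_ops operator class, roughly as below (whether the planner can use it for a pattern built from another column is a separate question):

CREATE INDEX tmp_mdc_domain_reverse_idx
    ON tmp (mdc_domain_reverse text_pattern_ops);
ANALYZE tmp;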

Planner sometimes doesn't use a relevant index with IN (subquery) condition

[Please CC me on replies, as I'm not subscribed; thank you.]

I've run into a problem with the query planner and IN (subquery)
conditions which I suspect to be a bug. I'll attempt to describe the
relevant details of my database and explain which behaviour I find
unexpected. I've also tried to trigger this behaviour in a clean
database; I think I've succeeded, but the conditions are a bit
different, so perhaps it's a different problem.
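
A common workaround when IN (subquery) gets a poor plan is to rewrite it as a correlated EXISTS (or a join) and compare plans. A self-contained sketch with hypothetical tables:

CREATE TABLE customer (id int PRIMARY KEY, region text);
CREATE TABLE orders   (id int PRIMARY KEY, customer_id int REFERENCES customer);

-- IN (subquery) form:
SELECT o.* FROM orders o
WHERE o.customer_id IN (SELECT c.id FROM customer c WHERE c.region = 'EU');

-- equivalent EXISTS form, which sometimes gets a better plan:
SELECT o.* FROM orders o
WHERE EXISTS (SELECT 1 FROM customer c
              WHERE c.id = o.customer_id AND c.region = 'EU');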

parallel query evaluation

Hi,

I have

create table x ( att bigint, val bigint, hash varchar(30)
);

with 693 million rows. The query

create table y as select att, val, count(*) as cnt from x
group by att, val;

ran for more than 2000 minutes and used 14 GB of memory on a machine with
8 GB of physical RAM -- eventually I stopped it.
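
Assuming the blow-up comes from a hash aggregate built on a bad group-count estimate (HashAggregate does not spill to disk in these versions), one thing worth trying is to ANALYZE first and keep work_mem modest so the planner prefers a sort-based GroupAggregate:

ANALYZE x;               -- give the planner a realistic estimate of the number of groups
SET work_mem = '64MB';   -- session-only; a small value steers the planner toward
                         -- a sort-based GroupAggregate, which can spill to disk
CREATE TABLE y AS
SELECT att, val, count(*) AS cnt
FROM x
GROUP BY att, val;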

Thousands of databases or schemas

We have a web application where we create a schema or a database with a
number of tables in it for each customer. Now we have about 2600 clients.

The problem we ran into using a separate DB for each client is that creating
a new DB can take up to 2 minutes, which is absolutely unacceptable. Using
schemas instead (one DB with a number of schemas containing similar tables
in it) solved this problem (schemas are created in a couple of seconds), but
created two other blocking points:
1. sometimes creation of a new table in a schema takes up to 5 seconds.

Unique values across a table of arrays - documents and tags

Hello,

I know I need to re-engineer this so it doesn't suck by design, so I'm
wondering if there is some nifty PostgreSQL feature or best practice
which may automagically do the best thing.

I store information about documents which are tagged by string tags.
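
A sketch of one common way to model this (names are hypothetical): an array column with a GIN index for tag lookups, and unnest() for the set of distinct tags:

CREATE TABLE document (
    id   serial PRIMARY KEY,
    body text,
    tags text[] NOT NULL DEFAULT '{}'
);

CREATE INDEX document_tags_gin ON document USING gin (tags);

-- documents carrying a given tag (can use the GIN index):
SELECT id FROM document WHERE tags @> ARRAY['postgres'];

-- the distinct tags across all documents:
SELECT DISTINCT unnest(tags) AS tag FROM document;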

HT on or off for E5-26xx ?

I'm bringing up a new type of server using Intel E5-2620 (unisocket)
which was selected for good SpecIntRate performance vs cost/power (201
for $410 and 95W).

I was assuming it was 6-core, but I just noticed it has HT, which is
currently enabled since I see 12 cores in /proc/cpuinfo.

Question for the performance experts: is it better to have HT enabled
or disabled for this generation of Xeon?
Workload will be moderately concurrent, small OLTP type transactions.
We'll also run a few low-load VMs (using KVM) and a big Java application.

Any thoughts welcome.
Thanks.

Query completed in < 1s in PG 9.1 and ~ 700s in PG 9.2

Hello, this is my first message to this list, so sorry if this is not
the right place to discuss this or if some data is missing from this
message.

I'll gladly send any data you request that would help us to understand
this issue.

DBT-1

Where should I get the original working source code of the DBT-1 benchmark, and is it compatible with Postgres 9.2?

Best Regards

Suggested test points for a performance tool?

Hello everyone,

If one were to create a Postgres performance tool, what would one like to measure aside from the obvious things common to all databases (query times, locks, etc.)?

I'm in the slightly difficult situation of being asked to port "Jet Profiler for MySQL" (http://www.jetprofiler.com) for use with Postgres (a "Jet Profiler for PostgreSQL", if you will).
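
For what it's worth, much of the low-hanging fruit is exposed through the statistics views; a couple of starting points (column names below are the pre-9.2 ones; 9.2 renames procpid to pid and splits current_query into state and query):

-- what is running right now and for how long:
SELECT procpid, now() - query_start AS runtime, waiting, current_query
FROM pg_stat_activity
WHERE current_query <> '<IDLE>'
ORDER BY runtime DESC;

-- per-database counters (commits, rollbacks, cache hits vs. reads):
SELECT datname, numbackends, xact_commit, xact_rollback, blks_read, blks_hit
FROM pg_stat_database;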

Constraint exclusion in views

Hi list.

I've been battling with a design issue here.

I have postgres 9.0.x deployed in some databases, and was designing
some changes that involve querying in a very partition-like way, but
not quite.

In particular, I have a few tables (let's call them table1...tableN). N
is pretty small here, but it might grow over time.
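
A minimal sketch of the partition-like approach (hypothetical tables): a UNION ALL view whose branches carry CHECK constraints, which the planner can use to skip branches that cannot match; the default constraint_exclusion = partition already applies to UNION ALL subqueries:

CREATE TABLE table1 (kind int NOT NULL CHECK (kind = 1), payload text);
CREATE TABLE table2 (kind int NOT NULL CHECK (kind = 2), payload text);

CREATE VIEW all_data AS
    SELECT * FROM table1
    UNION ALL
    SELECT * FROM table2;

-- only table2 should be scanned here:
EXPLAIN SELECT * FROM all_data WHERE kind = 2;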

freebsd or linux

I'm running a server that does lots of counts and calculations.
It currently runs Ubuntu Server -- is FreeBSD faster?

Also, this is an i386 machine.

Or are Linux and BSD about the same?

This is not meant to start an argument; I'm just looking for current benchmarks to compare.

thanks

help with too slow query

Hello,
I have this table definition:
CREATE TABLE ism_floatvalues
(
id_signal bigint NOT NULL, -- Indicates the signal this value belongs to.
Foreign key referencing the id_signal column of the ism_signal table.
time_stamp timestamp without time zone NOT NULL, -- Timestamp indicating
the date and time this value corresponds to.

dbt2 performance regresses from 9.1.6 to 9.2.1

Hi there,

I work for VMware with our Postgres performance team. We recently came across a dbt2 performance regression from 9.1.6 to 9.2.1. We have done some profiling and don't see anything obvious.

Seq scan on big table, episode 2

Hi all :)

I'm here again.
This time I'll provide more details (explain analyze, data-type, and
indexes), hope it will be enough :)

The query that produces a plan that I do not understand is the
following:
This is the explain analyze:

Invalid memory alloc request size

Dear friends,

I have an SQL file more than 1 GB in size.
When I execute it, after some time the error "invalid memory alloc request size
100234023 bytes" occurs.
What is the problem? I don't know.

With thanks,
mahavir

Increasing work_mem and shared_buffers on Postgres 9.2 significantly slows down queries

Hello,

I have a PostgreSQL 9.2 instance running on RHEL 6.3, 8-core machine with
16GB of RAM. The server is dedicated to this database, the disks are local
RAID10. Given that the default postgresql.conf is quite conservative
regarding memory settings, I thought it might be a good idea to allow
Postgres to use more memory.
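
One low-risk way to evaluate such changes is per session rather than in postgresql.conf, so a single connection can try the larger value first:

SET work_mem = '64MB';    -- affects only this session
SHOW work_mem;
-- run the problem queries here and compare timings
RESET work_mem;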

pg_buffercache

Hi all,

I was wondering if it is safe to install pg_buffercache on production
systems?

Thank you.
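
A typical use looks roughly like this (9.1+ CREATE EXTENSION syntax), along the lines of the documentation's example query:

CREATE EXTENSION pg_buffercache;

-- which relations of the current database occupy the most shared buffers:
SELECT c.relname, count(*) AS buffers
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
WHERE b.reldatabase IN (0, (SELECT oid FROM pg_database
                            WHERE datname = current_database()))
GROUP BY c.relname
ORDER BY buffers DESC
LIMIT 10;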

High %SYS CPU usage

Hello there,

I have a PostgreSQL 8.3.18 server running on CentOS 6.2 (2.6.32-220.7.1) with
these specs:

2x CPU AMD Opteron 6282
128GB RAM
Raid 10 (12HD 15k rpm 1GB cache) with data
Raid 10 (4HD 15k rpm 1GB cache) with xlog
Raid 1 (15k rpm 1GB cache shared with xlog) with system

On this server I have only one database with 312GB of data.

Seq scan on 10million record table.. why?

Hi all

I have a problem with a data import procedure that involves the following query:

select a,b,c,d
from big_table b
join data_sequences_table ds
on b.key1 = ds.key1 and b.key2 = ds.key2
where ds.import_id=xxxxxxxxxx

The "big table" has something like 10.000.000 records ore more
(depending on the table, there are more than one of them).
The data are uploaded in 20k record blocks, and the keys are written
on "data_sequences_table".

Re: [PERFORM] How to keep queries low latency as concurrency increases

You should carefully test transaction-based pools limited to around 8
DB connections. Experiment with different size limits.

<a href="http://wiki.postgresql.org/wiki/Number_Of_Database_Connections" title="http://wiki.postgresql.org/wiki/Number_Of_Database_Connections">http://wiki.postgresql.org/wiki/Number_Of_Database_Connections</a>

PostgreSQL server failed to start

Hi,
When I start my Postgres I am getting this error. I have both
8.4 and 9.1 installed.
It was working fine yesterday, but not now.
service postgresql restart
* Restarting PostgreSQL 8.4 database server
* The PostgreSQL server failed to start. Please check the log output.

If I look at the log, it only shows yesterday's entries. Please give me a suggestion.
Thanks for any reply.

out of memory

Hi,

I have an SQL file (its size is 1 GB).
When I execute it, the error "String of 987098801 bytes is too long for encoding
conversion" occurs.
Please give me a solution for this.

I have XP 64-bit with 8 GB RAM, shared_buffers = 1GB, checkpoint_segments = 34.

With thanks,
mahavir

Slow query, where am I going wrong?

Hello all,

I have been pulling my hair out over the last few days trying to get any useful performance out of the following
painfully slow query.
The query is JPA-generated; I've just cleaned up the aliases to make it more readable.
Using 'distinct' or 'group by' delivers about the same results, but 'distinct' is marginally better.
Hardware is pretty low end (a test box), but is mostly dedicated to PostgreSQL.
The box spec and configuration are included at the end of this post - some of the values have been changed just to see if
things get better.
Inserts have also become extremely slow.

How to keep queries low latency as concurrency increases

As I increase concurrency, I'm seeing queries that I believe are too slow
given the minuscule amount of data in my tables.

I have 20 Django worker processes and use ab to generate 3000 requests
to a particular URL which is doing some read only queries. I ran this
with ab concurrency level set to 4, 12 and 20.

Re: [PERFORM] Request for help with slow query

Higher OFFSET values may be slow because the server has to read through
OFFSET result rows before returning anything.
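
That is why keyset ("seek") pagination is usually recommended over large OFFSETs; a self-contained sketch with a hypothetical table:

CREATE TABLE items (id serial PRIMARY KEY, title text);

-- OFFSET pagination: the server still produces and discards the first 100000 rows
SELECT id, title FROM items ORDER BY id LIMIT 50 OFFSET 100000;

-- keyset pagination: remember the last id of the previous page instead
SELECT id, title FROM items WHERE id > 100050 ORDER BY id LIMIT 50;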

Re: [PERFORM] Setting Statistics on Functional Indexes

Yeah, this has been a recurring problem with database statistics
with various products for at least 20 years. For a while I was using
a product whose optimizer engineers referred to it as "data skew" and
recommended adding a "dummy" entry to get a single value out past the
maximum end of the range.

Request for help with slow query

Hi, thanks for any help. I've tried to be thorough, but let me know if I should
provide more information.

A description of what you are trying to achieve and what results you expect:
I have a large (3 million row) table called "tape" that represents files,
which I join to a small (100 row) table called "filesystem" that represents
filesystems.

Replaying 48 WAL files takes 80 minutes

I am configuring streaming replication with hot standby
with PostgreSQL 9.1.3 on RHEL 6 (kernel 2.6.32-220.el6.x86_64).
PostgreSQL was compiled from source.

It works fine, except that starting the standby took forever:
it took the system more than 80 minutes to replay 48 WAL files
and connect to the primary.

Can anybody think of an explanation why it takes that long?

This is decent hardware: 24 cores of AMD Opteron 6174, 128 GB RAM,
NetApp SAN attached with 8 GBit Fibrechannel (ext4 file system).
An identical system performed fine in performance tests.

Here is the log; I have edited it fo

Slower Performance on Postgres 9.1.6 vs 8.2.11

All... first let me say thank you for this forum.... I am new to it and
relatively new to postgres, more of a sysadmin than a DBA, but let me
explain my issue. I'll also try to post relevant information.

Our IT group took over an app that we run on Postgres, and it has
been on version 8.2.11 since we acquired it. It is time to get current, so
I have created instances of our production database that mirror the exact
hardware of our existing implementation on version 8.2.11 (running Fedora
Core 8 - wow, I know) and also version 9.1.6 on Fedora 17.

PSA: New Kernels and intel_idle cpuidle Driver!

Hey guys,

I have a pretty nasty heads-up. If you have hardware using an Intel XEON
and a newer Linux kernel, you may be experiencing very high CPU latency.
You can check yourself:

cat /sys/devices/system/cpu/cpuidle/current_driver

If it says intel_idle, the Linux kernel will *aggressively* put your CPU
to sleep. We definitely noticed this, and it's pretty darn painful. But
it's *more* painful in your asynchronous, standby, or otherwise less
busy nodes. Why?

As you can imagine, the secondary nodes don't get much activity, so they
spend most of their time sleeping.

Re: [PERFORM] Query-Planer from 6seconds TO DAYS

You can set those per tablespace.
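
Since 9.0 the per-tablespace form looks like this (tablespace names here are hypothetical):

ALTER TABLESPACE fast_ssd  SET (random_page_cost = 1.5, seq_page_cost = 1.0);
ALTER TABLESPACE slow_sata SET (random_page_cost = 6.0);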

BAD performance with enable_bitmapscan = on with Postgresql 9.0.X (X = 3 and 10)

Hi,
I have a tree structure managed with ltree and a GiST index.
The simplified schema is:

CREATE TABLE crt (
idcrt INT NOT NULL,
...
pathname LTREE
)
idcrt is the primary key; there is another index, ix_crt_pathname, on pathname using GiST

CREATE TABLE doc (
iddoc INT NOT NULL, ...)
iddoc is the primary key

CREATE TABLE folder_document (
id_folder int not null,
id_document int not null,
...
path_folder ltree not null
);
id_folder, id_document form the primary key;
ix_folder_document_path_folder is on path_folder using GiST

When enable_bitmapscan is on, the query runs for about 1000 seconds; when I turn
bitmapscan off, the query runs in
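
For comparing the two plans without touching postgresql.conf, the toggle can be applied per session:

SET enable_bitmapscan = off;   -- affects only this session
-- run EXPLAIN ANALYZE on the slow query here and compare with the bitmap plan
RESET enable_bitmapscan;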

Re: [PERFORM] Query-Planer from 6seconds TO DAYS

The goal should be to set the cost factors so that they model actual
costs for your workload in your environment. In what cases have you
seen the sequential scan of a large number of adjacent pages from
disk take longer than randomly reading the same number of pages from
disk? (I would love to see the bonnie++ numbers for that, if you have
them.)

-Kevin

Re: [PERFORM] Query-Planer from 6seconds TO DAYS

The idea is to model actual costs on your system.

Re: [PERFORM] How to upgrade from 9.1 to 9.2 with replication?

... as long as the rsync was bracketed by calls to pg_start_backup()
and pg_stop_backup().

-Kevin
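
In other words, something along these lines around the rsync (the label is arbitrary):

SELECT pg_start_backup('upgrade-rsync');
-- ... rsync the data directory to the standby here ...
SELECT pg_stop_backup();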

Re: [PERFORM] Out of shared mem on new box with more mem, 9.1.5 -> 9.1.6

One thing to look for is a connection stuck in "idle in transaction"
or old prepared transactions in pg_prepared_xacts. Either will cause
all sorts of problems, but if you are using serializable transactions
the error you are seeing is often the first symptom. I'm starting to
think we should add something about that to the hint.

On the other hand, it could just be that you need to increase the
setting the hint currently references. For complex databases it is
definitely on the low side.
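
Both conditions are easy to check from psql; with the pre-9.2 catalog columns that looks roughly like:

-- sessions sitting idle inside an open transaction (9.1 and earlier):
SELECT procpid, xact_start, current_query
FROM pg_stat_activity
WHERE current_query = '<IDLE> in transaction'
ORDER BY xact_start;

-- forgotten two-phase transactions:
SELECT gid, prepared, owner, database
FROM pg_prepared_xacts;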

Setting Statistics on Functional Indexes

Hey everyone,

So recently we upgraded to 9.1 and have noticed a ton of our queries got
much worse. It turns out that 9.1 is *way* more optimistic about our
functional indexes, even when they're entirely the wrong path. So after
going through the docs, I see that the normal way to increase stats is
to alter columns directly on a table, or change the
default_statistics_target itself.

But there doesn't seem to be any way to increase stats for a functional
index unless you modify default_statistics_target.
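
For what it's worth, there is a reported (and at the time undocumented) workaround: ALTER TABLE accepts the index itself, using the index's internally generated column name. A hedged sketch with hypothetical names:

-- find the generated column name of the expression index:
SELECT attname
FROM pg_attribute
WHERE attrelid = 'my_func_idx'::regclass;

-- then raise the target on that column (substitute the attname found above)
-- and re-analyze the table:
ALTER TABLE my_func_idx ALTER COLUMN expr SET STATISTICS 1000;
ANALYZE my_table;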

Query-Planer from 6seconds TO DAYS

Hi,

I've got a very strange problem on PostgreSQL 8.4, where the query planner goes absolutely haywire when one parameter is changed slightly.

First the Tables which are involved:
1.

Re: [PERFORM] Query with limit goes from few ms to hours

Of course, if your autovacuum settings are aggressive enough, you
should generally not need to run ANALYZE explicitly. You should
double-check that autovacuum is turned on and configured at least as
aggressively as the default settings, or you will probably get little
surprises like this when you least expect them.

-Kevin
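
Checking that is straightforward; roughly:

-- current autovacuum-related settings:
SELECT name, setting
FROM pg_settings
WHERE name LIKE 'autovacuum%';

-- when each table was last analyzed, manually or by autovacuum:
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY relname;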

Prepared statements slow in 9.2 still (bad query plan)

I have a problem with prepared statements choosing a bad query plan - I was hoping that 9.2 would have eradicated the problem :(

Taken from the postgresql log:

<2012-10-23 15:21:03 UTC acme_metastore 13798 5086b49e.35e6> LOG: duration: 20513.809 ms execute S_6: SELECT S.Subj, S.Prop, S.Obj
FROM jena_g1t1_stmt S WHERE S.Obj = $1 AND S.Subj = $2 AND S.Prop = $3 AND S.GraphID = $4

<2012-10-23 15:21:03 UTC acme_metastore 13798 5086b49e.35e6> DETAIL: parameters: $1 =
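
One way to see which plan a prepared statement is actually getting is EXPLAIN ANALYZE EXECUTE; a self-contained sketch loosely modeled on the statement above (table, types, and parameter values here are hypothetical):

CREATE TABLE stmt_demo (subj text, prop text, obj text, graphid int);
CREATE INDEX stmt_demo_idx ON stmt_demo (obj, subj, prop, graphid);

PREPARE s_demo (text, text, text, int) AS
    SELECT subj, prop, obj FROM stmt_demo
    WHERE obj = $1 AND subj = $2 AND prop = $3 AND graphid = $4;

-- after a few executions the server may switch to a generic plan;
-- this shows the plan actually used for these parameter values:
EXPLAIN ANALYZE EXECUTE s_demo('o', 's', 'p', 1);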

Tons of free RAM. Can't make it go away.

Hey everyone!

This is pretty embarrassing, but I've never seen this before. This is
our system's current memory allocation from 'free -m':

total used free buffers cached
Mem: 72485 58473 14012 3 34020
-/+ buffers/cache: 24449 48036

So, I've got 14GB of RAM that the OS is just refusing to use for disk or
page cache. Does anyone know what might cause that?

Our uname -sir, for reference:

Linux 3.2.0-31-generic x86_64

Connection Options -- SSL already uses compression?

Am I reading this correctly -- it appears that if SSL negotiation is
enabled for a connection (say, when using pg_basebackup over a WAN) that
compression /*is automatically used*/ (provided it is supported on both
ends)?

Is there a way to check and see if it _*is*_ on for a given connection?

I was looking to hack in zlib support and saw that appears to be
already-present support, provided SSL connection security is enabled.

Tablespace-derived stats?

Hello Perf,

Lately I've been pondering. As systems get more complex, it's not
uncommon for tiered storage to enter the picture. Say for instance, a
user has some really fast tables on a NVRAM-based device, and
slower-access stuff on a RAID, even slower stuff on an EDB, and variants
like local disk or a RAM drive.

Yet there's only one global setting for random_page_cost, and
seq_page_cost, and so on.

Would there be any benefit at all to adding these as parameters to the
tablespaces themselves?

Recursive query gets slower when adding an index

Hi,

I have a self-referencing table that defines a hierarchy of projects and sub-projects.

This is the table definition:

CREATE TABLE project
(
project_id integer primary key,
project_name text,
pl_name text,
parent_id integer
);

ALTER TABLE project
ADD CONSTRAINT project_parent_id_fkey FOREIGN KEY (parent_id)
REFERENCES project (project_id)
ON UPDATE NO ACTION
ON DELETE NO ACTION;

The table contains ~11000 rows

The following statement:

with recursive project_tree as (
select project_id,
parent_id,
pl_name as root

How to upgrade from 9.1 to 9.2 with replication?

I have replication set up on servers with 9.1 and want to upgrade to 9.2.
I was hoping I could just bring them both down, upgrade them both and bring
them both up and continue replication, but that doesn't seem to work; the
replication server won't come up.
Is there any way to do this upgrade without taking a new base backup and
rebuilding the replication drive?

Unused index influencing sequential scan plan

Hi all,

I've created a test table containing 21 million random dates and
times, but I get wildly different results when I introduce a
functional index then ANALYSE again, even though it doesn't use the
index:

postgres=# CREATE TABLE test (id serial, sampledate timestamp);
CREATE TABLE
postgres=# INSERT INTO test (sampledate) SELECT '1970-01-01
00:00:00'::timestamp + (random()*1350561945 || ' seconds')::interval
FROM generate_series(1,21000000);
INSERT 0 21000000
postgres=# VACUUM;
VACUUM
postgres=# EXPLAIN SELECT extract(month FROM sampledate), count(*)
FROM test GROUP BY extract(month FROM
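
The functional index being introduced is presumably something along these lines (a hedged guess, since the excerpt cuts off before showing it):

CREATE INDEX test_month_idx ON test ((extract(month FROM sampledate)));
ANALYZE test;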

High cost estimates when n_distinct is set

Hello

I am working on a potentially large database table, let's call it "observation", that has a foreign key to table "measurement". Each measurement is associated with either none or around five observations. In this kind of situation, it is well known that the statistics on the foreign key column in observation table can get arbitrarily bad as the row count increases. Especially, the estimate of the number of distinct values in the foreign key column can be completely off.

To combat this issue I have set n_distinct=-0.2 on the foreign key column.
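
For reference, that override is applied per column (the column name below is hypothetical) and only takes effect at the next ANALYZE:

ALTER TABLE observation
    ALTER COLUMN measurement_id SET (n_distinct = -0.2);
ANALYZE observation;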

Out of shared mem on new box with more mem, 9.1.5 -> 9.1.6

We've run into a perplexing issue with a customer database.

pgbouncer max_client_conn and default_pool_size

What are adequate pgbouncer max_client_conn and default_pool_size values
for a Postgres config which has max_connections = 400?

We want to move to pgbouncer to let Postgres handle only the DB work, but it has
confused us. We have over 120 databases in a single Postgres instance with,
as I said, max_connections = 400.

The 120+ databases are driven from 3 separate web servers, and we are planning to
install pgbouncer on all three of them.

PS: in the pgbouncer source code it looks like it opens separate FDs for
the max_client_conn * default_pool_size value.

Regards,

Yetkin Öztürk

Have: Seq Scan - Want: Index Scan - what am I doing wrong?

Hi guys,

PG = 9.1.5
OS = winDOS 2008R8

I have a table that currently has 207 million rows.
There is a timestamp field that contains data.
More data gets copied from another database into this database.
How do I make this do an index scan instead?
I did an "analyze audittrailclinical" to no avail.
I tested different indexes -- no change, same behavior.

The query does this:

SELECT
audittrailclinical.pgid,
audittrailclinical.timestamp,
mmuser.logon,
audittrailclinical.entityname,
audittrailclinical.clinicalactivity,
audittrailclinical.audittraileventcode,
account.accountnumber,
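
A quick diagnostic that often helps in cases like this is to check whether any index plan exists at all by discouraging the seq scan for one session:

SET enable_seqscan = off;   -- diagnostic only, session-local
-- EXPLAIN ANALYZE the query above here; if it still seq-scans, no usable index exists
RESET enable_seqscan;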

Support Create package

Hi,

Why does the EnterpriseDB version of PostgreSQL support create/alter/drop package while the open-source one doesn't?
Is it a planned project, or will it never be supported?

Thanks
