I have a "big" spatial database: 72 GB, 200 million records with a geometry (points). I have a GiST index on the geometries.
The database:
CREATE TABLE resources
(
id bigint NOT NULL,
accuracy varchar(3),
owner varchar(100),
title varchar(500),
posted timestamp,
nbtags int,
url text,
description text,
rawtags text,
tags text,
geom geometry NOT NULL,
CONSTRAINT resources_pkey PRIMARY KEY (id)
);
CREATE INDEX resources_index_rep ON resources USING GIST (geom);
CREATE RULE resources_rule AS ON INSERT TO resources
WHERE EXISTS(SELECT 1 FROM resources
WHERE id =NEW.id)
DO INSTEAD NOTHING;
Of course I want to query this database. However, it is very, very slow... (I have already run VACUUM and ANALYZE.)
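For reference, the maintenance step mentioned above would look roughly like the following. This is a sketch that assumes it was run against the whole `resources` table; the exact options used are not recorded here.

```sql
-- Assumption: run once on the whole table after the bulk load.
-- VACUUM reclaims dead tuples; ANALYZE refreshes the planner statistics
-- that feed the row estimates shown in the query plans.
VACUUM ANALYZE resources;
```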
For example, with the bounding box of France:
EXPLAIN ANALYZE
SELECT count(*)
FROM resources
WHERE st_intersects(st_geomfromtext('POLYGON((-5.134723 41.364166,-5.134723 51.09111,9.562222 51.09111,9.562222 41.364166,-5.134723 41.364166))',4326), geom)
QUERY PLAN
Aggregate (cost=14411182.45..14411182.46 rows=1 width=0) (actual time=17368911.185..17368911.186 rows=1 loops=1)
-> Bitmap Heap Scan on resources (cost=890493.27..14391392.83 rows=7915848 width=0) (actual time=2627029.235..17357552.440 rows=21381758 loops=1)
Recheck Cond: ('0103000020E6100000010000000500000077137CD3F48914C05628D2FD9CAE444077137CD3F48914C0562B137EA98B4940F52EDE8FDB1F2340562B137EA98B4940F52EDE8FDB1F23405628D2FD9CAE444077137CD3F48914C05628D2FD9CAE4440'::geometry && geom)
Filter: _st_intersects('0103000020E6100000010000000500000077137CD3F48914C05628D2FD9CAE444077137CD3F48914C0562B137EA98B4940F52EDE8FDB1F2340562B137EA98B4940F52EDE8FDB1F23405628D2FD9CAE444077137CD3F48914C05628D2FD9CAE4440'::geometry, geom)
-> Bitmap Index Scan on resources_index_rep (cost=0.00..888514.31 rows=23747545 width=0) (actual time=2626090.609..2626090.609 rows=21381788 loops=1)
Index Cond: ('0103000020E6100000010000000500000077137CD3F48914C05628D2FD9CAE444077137CD3F48914C0562B137EA98B4940F52EDE8FDB1F2340562B137EA98B4940F52EDE8FDB1F23405628D2FD9CAE444077137CD3F48914C05628D2FD9CAE4440'::geometry && geom)
Total runtime: 17368946.473 ms
7 row(s)
It took almost 5 hours.
And with Switzerland:
EXPLAIN ANALYZE
SELECT count(*)
FROM resources
WHERE st_intersects(st_geomfromtext('POLYGON((5.96611 45.829437,5.96611 47.806938,10.488913 47.806938,10.488913 45.829437,5.96611 45.829437))',4326), geom)
QUERY PLAN
Aggregate (cost=5623425.68..5623425.69 rows=1 width=0) (actual time=6798842.064..6798842.064 rows=1 loops=1)
-> Bitmap Heap Scan on resources (cost=88982.29..5621452.65 rows=789212 width=0) (actual time=783383.496..6797250.482 rows=2539057 loops=1)
Recheck Cond: ('0103000020E61000000100000005000000AF5A99F04BDD1740D28BDAFD2AEA4640AF5A99F04BDD174028F38FBE49E74740B22D03CE52FA244028F38FBE49E74740B22D03CE52FA2440D28BDAFD2AEA4640AF5A99F04BDD1740D28BDAFD2AEA4640'::geometry && geom)
Filter: _st_intersects('0103000020E61000000100000005000000AF5A99F04BDD1740D28BDAFD2AEA4640AF5A99F04BDD174028F38FBE49E74740B22D03CE52FA244028F38FBE49E74740B22D03CE52FA2440D28BDAFD2AEA4640AF5A99F04BDD1740D28BDAFD2AEA4640'::geometry, geom)
-> Bitmap Index Scan on resources_index_rep (cost=0.00..88784.99 rows=2367636 width=0) (actual time=782664.847..782664.847 rows=2539063 loops=1)
Index Cond: ('0103000020E61000000100000005000000AF5A99F04BDD1740D28BDAFD2AEA4640AF5A99F04BDD174028F38FBE49E74740B22D03CE52FA244028F38FBE49E74740B22D03CE52FA2440D28BDAFD2AEA4640AF5A99F04BDD1740D28BDAFD2AEA4640'::geometry && geom)
Total runtime: 6798852.955 ms
7 row(s)
It took almost 2 hours.
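Since both search areas are axis-aligned rectangles and every geometry is a point, one variant that might be worth timing is a pure bounding-box test, which is what the Index Cond in the plans already uses. The sketch below assumes `ST_MakeEnvelope` is available in the installed PostGIS version, which has not been checked here. The `&&` operator compares bounding boxes only, so the count can differ slightly near the edges of the rectangle; the plans above already show the index condition matching a few more rows than the `_st_intersects` filter keeps (2539063 versus 2539057 for Switzerland).

```sql
-- Sketch only: the same Switzerland rectangle as above, built as an envelope.
-- ST_MakeEnvelope takes (xmin, ymin, xmax, ymax, srid).
-- && is a bounding-box overlap test, so no _st_intersects filter is applied.
EXPLAIN ANALYZE
SELECT count(*)
FROM resources
WHERE geom && ST_MakeEnvelope(5.96611, 45.829437, 10.488913, 47.806938, 4326);
```

Whether this helps is uncertain: in both plans most of the time is spent in the Bitmap Heap Scan rather than in the filter, and this variant still has to visit the heap.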
I am looking for a way to improve the performance.
About our server:
- CPU: 4
- RAM: 16 GB
- OS: Ubuntu 12.04 LTS 64bit
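The memory settings and on-disk sizes are probably relevant as well, since a 72 GB database cannot fit in 16 GB of RAM. A minimal sketch of how they could be collected, using only standard PostgreSQL commands and the table and index names from the schema above (the values are not included here because they have not been gathered yet):

```sql
-- Memory-related settings currently in effect
SHOW shared_buffers;
SHOW work_mem;
SHOW effective_cache_size;

-- On-disk size of the table (including its indexes and TOAST data)
-- and of the GiST index on its own
SELECT pg_size_pretty(pg_total_relation_size('resources'))     AS table_total_size,
       pg_size_pretty(pg_relation_size('resources_index_rep')) AS gist_index_size;
```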