sql - Spatial query on large table with multiple self joins performing slow
I am working on queries on a large table in Postgres 9.3.9. It is a spatial dataset, and it is spatially indexed. Say I need to find 3 types of objects: A, B and C. The criterion is that B and C are both within a certain distance of A, say 500 meters.
My query is this:
SELECT school.osm_id     AS school_osm_id,
       school.name       AS school_name,
       school.way        AS school_way,
       restaurant.osm_id AS restaurant_osm_id,
       restaurant.name   AS restaurant_name,
       restaurant.way    AS restaurant_way,
       bar.osm_id        AS bar_osm_id,
       bar.name          AS bar_name,
       bar.way           AS bar_way
FROM  (SELECT osm_id, name, amenity, way, way_geo
       FROM   planet_osm_point
       WHERE  amenity = 'school') AS school,
      (SELECT osm_id, name, amenity, way, way_geo
       FROM   planet_osm_point
       WHERE  amenity = 'restaurant') AS restaurant,
      (SELECT osm_id, name, amenity, way, way_geo
       FROM   planet_osm_point
       WHERE  amenity = 'bar') AS bar
WHERE  ST_DWithin(school.way_geo, restaurant.way_geo, 500, false)
AND    ST_DWithin(school.way_geo, bar.way_geo, 500, false);
This query gives me what I want, but it takes a long time, about 13 seconds to execute. I'm wondering if there is another way to write the query and make it more efficient.
Query plan:
Nested Loop  (cost=74.43..28618.65 rows=1 width=177) (actual time=33.513..11235.212 rows=10591 loops=1)
  Buffers: shared hit=530967 read=8733
  ->  Nested Loop  (cost=46.52..28586.46 rows=1 width=174) (actual time=31.998..9595.212 rows=4235 loops=1)
        Buffers: shared hit=389863 read=8707
        ->  Bitmap Heap Scan on planet_osm_point  (cost=18.61..2897.83 rows=798 width=115) (actual time=7.862..150.607 rows=8811 loops=1)
              Recheck Cond: (amenity = 'school'::text)
              Buffers: shared hit=859 read=5204
              ->  Bitmap Index Scan on idx_planet_osm_point_amenity  (cost=0.00..18.41 rows=798 width=0) (actual time=5.416..5.416 rows=8811 loops=1)
                    Index Cond: (amenity = 'school'::text)
                    Buffers: shared hit=3 read=24
        ->  Bitmap Heap Scan on planet_osm_point planet_osm_point_1  (cost=27.91..32.18 rows=1 width=115) (actual time=1.064..1.069 rows=0 loops=8811)
              Recheck Cond: ((way_geo && _st_expand(planet_osm_point.way_geo, 500::double precision)) AND (amenity = 'restaurant'::text))
              Filter: ((planet_osm_point.way_geo && _st_expand(way_geo, 500::double precision)) AND _st_dwithin(planet_osm_point.way_geo, way_geo, 500::double precision, false))
              Rows Removed by Filter: 0
              Buffers: shared hit=389004 read=3503
              ->  BitmapAnd  (cost=27.91..27.91 rows=1 width=0) (actual time=1.058..1.058 rows=0 loops=8811)
                    Buffers: shared hit=384528 read=2841
                    ->  Bitmap Index Scan on idx_planet_osm_point_waygeo  (cost=0.00..9.05 rows=137 width=0) (actual time=0.193..0.193 rows=64 loops=8811)
                          Index Cond: (way_geo && _st_expand(planet_osm_point.way_geo, 500::double precision))
                          Buffers: shared hit=146631 read=2841
                    ->  Bitmap Index Scan on idx_planet_osm_point_amenity  (cost=0.00..18.41 rows=798 width=0) (actual time=0.843..0.843 rows=6291 loops=8811)
                          Index Cond: (amenity = 'restaurant'::text)
                          Buffers: shared hit=237897
  ->  Bitmap Heap Scan on planet_osm_point planet_osm_point_2  (cost=27.91..32.18 rows=1 width=115) (actual time=0.375..0.383 rows=3 loops=4235)
        Recheck Cond: ((way_geo && _st_expand(planet_osm_point.way_geo, 500::double precision)) AND (amenity = 'bar'::text))
        Filter: ((planet_osm_point.way_geo && _st_expand(way_geo, 500::double precision)) AND _st_dwithin(planet_osm_point.way_geo, way_geo, 500::double precision, false))
        Rows Removed by Filter: 1
        Buffers: shared hit=141104 read=26
        ->  BitmapAnd  (cost=27.91..27.91 rows=1 width=0) (actual time=0.368..0.368 rows=0 loops=4235)
              Buffers: shared hit=127019
              ->  Bitmap Index Scan on idx_planet_osm_point_waygeo  (cost=0.00..9.05 rows=137 width=0) (actual time=0.252..0.252 rows=363 loops=4235)
                    Index Cond: (way_geo && _st_expand(planet_osm_point.way_geo, 500::double precision))
                    Buffers: shared hit=101609
              ->  Bitmap Index Scan on idx_planet_osm_point_amenity  (cost=0.00..18.41 rows=798 width=0) (actual time=0.104..0.104 rows=779 loops=4235)
                    Index Cond: (amenity = 'bar'::text)
                    Buffers: shared hit=25410
Total runtime: 11238.605 ms
I'm using only 1 table at the moment, with 1,372,711 rows. It has 73 columns:
       Column       |         Type         |         Modifiers
--------------------+----------------------+---------------------------
 osm_id             | bigint               |
 access             | text                 |
 addr:housename     | text                 |
 addr:housenumber   | text                 |
 addr:interpolation | text                 |
 admin_level        | text                 |
 aerialway          | text                 |
 aeroway            | text                 |
 amenity            | text                 |
 area               | text                 |
 barrier            | text                 |
 bicycle            | text                 |
 brand              | text                 |
 bridge             | text                 |
 boundary           | text                 |
 building           | text                 |
 capital            | text                 |
 construction       | text                 |
 covered            | text                 |
 culvert            | text                 |
 cutting            | text                 |
 denomination       | text                 |
 disused            | text                 |
 ele                | text                 |
 embankment         | text                 |
 foot               | text                 |
 generator:source   | text                 |
 harbour            | text                 |
 highway            | text                 |
 historic           | text                 |
 horse              | text                 |
 intermittent       | text                 |
 junction           | text                 |
 landuse            | text                 |
 layer              | text                 |
 leisure            | text                 |
 lock               | text                 |
 man_made           | text                 |
 military           | text                 |
 motorcar           | text                 |
 name               | text                 |
 natural            | text                 |
 office             | text                 |
 oneway             | text                 |
 operator           | text                 |
 place              | text                 |
 poi                | text                 |
 population         | text                 |
 power              | text                 |
 power_source       | text                 |
 public_transport   | text                 |
 railway            | text                 |
 ref                | text                 |
 religion           | text                 |
 route              | text                 |
 service            | text                 |
 shop               | text                 |
 sport              | text                 |
 surface            | text                 |
 toll               | text                 |
 tourism            | text                 |
 tower:type         | text                 |
 tunnel             | text                 |
 water              | text                 |
 waterway           | text                 |
 wetland            | text                 |
 width              | text                 |
 wood               | text                 |
 z_order            | integer              |
 tags               | hstore               |
 way                | geometry(Point,4326) |
 way_geo            | geography            |
 gid                | integer              | not null default nextval('...
Indexes:
    "planet_osm_point_pkey1" PRIMARY KEY, btree (gid)
    "idx_planet_osm_point_amenity" btree (amenity)
    "idx_planet_osm_point_waygeo" gist (way_geo)
    "planet_osm_point_index" gist (way)
    "planet_osm_point_pkey" btree (osm_id)
There are 8811, 6291 and 779 rows for the amenities school, restaurant and bar, respectively.
This query should go a long way (be much faster):
WITH school AS (
   SELECT s.osm_id AS school_id, text 'school' AS type, s.osm_id, s.name, s.way_geo
   FROM   planet_osm_point s
        , LATERAL (
      SELECT 1 FROM planet_osm_point
      WHERE  ST_DWithin(way_geo, s.way_geo, 500, false)
      AND    amenity = 'bar'
      LIMIT  1  -- bar exists
                -- most selective first if possible
      ) b
        , LATERAL (
      SELECT 1 FROM planet_osm_point
      WHERE  ST_DWithin(way_geo, s.way_geo, 500, false)
      AND    amenity = 'restaurant'
      LIMIT  1  -- restaurant exists
      ) r
   WHERE  s.amenity = 'school'
   )
SELECT *
FROM  (
   TABLE school  -- schools

   UNION ALL  -- bars
   SELECT s.school_id, 'bar', x.*
   FROM   school s
        , LATERAL (
      SELECT osm_id, name, way_geo
      FROM   planet_osm_point
      WHERE  ST_DWithin(way_geo, s.way_geo, 500, false)
      AND    amenity = 'bar'
      ) x

   UNION ALL  -- restaurants
   SELECT s.school_id, 'rest.', x.*
   FROM   school s
        , LATERAL (
      SELECT osm_id, name, way_geo
      FROM   planet_osm_point
      WHERE  ST_DWithin(way_geo, s.way_geo, 500, false)
      AND    amenity = 'restaurant'
      ) x
   ) sub
ORDER  BY school_id, (type <> 'school'), type, osm_id;
This is not the same as your original query, but rather what you actually want, as per the discussion in the comments:
I want a list of all the schools that have restaurants and bars within 500 meters, and I need the coordinates of each school and its corresponding restaurants and bars.
So this query returns a list of those schools, followed by the bars and restaurants nearby. Each set of rows is held together by the osm_id of the school in the column school_id.
It now uses LATERAL joins to make use of the spatial GiST index.
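The two existence checks in the CTE could also be expressed as EXISTS semi-joins. This is a minimal sketch, not taken from the answer itself, and whether the planner produces an equally good plan on this dataset is an assumption that would need to be checked with EXPLAIN (ANALYZE, BUFFERS):

```sql
-- Sketch: schools that have at least one bar and one restaurant within 500 m.
-- Uses only the table, columns and ST_DWithin() call already shown above.
SELECT s.osm_id AS school_id, s.name, s.way_geo
FROM   planet_osm_point s
WHERE  s.amenity = 'school'
AND    EXISTS (
   SELECT 1
   FROM   planet_osm_point b
   WHERE  b.amenity = 'bar'
   AND    ST_DWithin(b.way_geo, s.way_geo, 500, false)
   )
AND    EXISTS (
   SELECT 1
   FROM   planet_osm_point r
   WHERE  r.amenity = 'restaurant'
   AND    ST_DWithin(r.way_geo, s.way_geo, 500, false)
   );
```

Like the LATERAL subqueries with LIMIT 1, each EXISTS can stop at the first matching row, so a school with many nearby bars is not multiplied in the result.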
TABLE school is just shorthand for SELECT * FROM school.
The expression (type <> 'school') orders the school in each set first, because the boolean value false sorts before true.
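The sort order of that boolean expression can be checked in isolation. A small sketch with literal values standing in for the real rows (the VALUES list is made up for illustration):

```sql
-- (type <> 'school') is false for the school row and true for all others;
-- false sorts before true, so the school row comes first.
SELECT type
FROM  (VALUES ('rest.'), ('school'), ('bar')) AS t(type)
ORDER  BY (type <> 'school'), type;
-- Returns: school, bar, rest.
```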
The subquery sub in the final SELECT is only needed to order by this expression. A UNION query limits an attached ORDER BY list to output columns only, no expressions.
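A reduced sketch of that restriction, with literal rows instead of the real table. The first statement is left commented out because Postgres rejects it; wrapping the UNION in a subquery makes the expression legal:

```sql
-- Rejected: ORDER BY attached to a UNION may only name result columns.
-- SELECT text 'school' AS type
-- UNION ALL
-- SELECT 'bar'
-- ORDER  BY (type <> 'school');

-- Accepted: the outer SELECT may order by any expression.
SELECT *
FROM  (
   SELECT text 'school' AS type
   UNION ALL
   SELECT 'bar'
   ) sub
ORDER  BY (type <> 'school'), type;
```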
I focus on the query you presented for the purpose of this answer, ignoring the extended requirement to filter on any of the other 70 text columns. That's really a design flaw. Search criteria should be concentrated in a few columns. Otherwise you'll have to index all 70 columns, and multicolumn indexes like the one I am going to propose are hardly an option. Still possible though ...
Index
In addition to the existing:
"idx_planet_osm_point_waygeo" gist (way_geo)
If you are always filtering on the same column, you could create a multicolumn index covering the few columns you are interested in, so that index-only scans become possible:
CREATE INDEX planet_osm_point_bar_idx ON planet_osm_point (amenity, name, osm_id);
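Whether the planner actually chooses an index-only scan can be verified with EXPLAIN. A sketch, assuming the index above has been created; the test query itself is only an illustration:

```sql
-- Index-only scans depend on the visibility map, which VACUUM maintains.
VACUUM ANALYZE planet_osm_point;

-- All three referenced columns are in the index, so no heap access is
-- needed for pages that are marked all-visible.
EXPLAIN (ANALYZE, BUFFERS)
SELECT amenity, name, osm_id
FROM   planet_osm_point
WHERE  amenity = 'bar';
```

If it works, the plan shows an "Index Only Scan using planet_osm_point_bar_idx" node, ideally with a low "Heap Fetches" count.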
Postgres 9.5
The upcoming Postgres 9.5 introduces major improvements that happen to address your case exactly:
Allow queries to perform accurate distance filtering of bounding-box-indexed objects (polygons, circles) using GiST indexes (Alexander Korotkov, Heikki Linnakangas)
Previously, a common table expression was required to return a large number of rows ordered by bounding-box distance, and then filtered further with a more accurate non-bounding-box distance calculation.
Allow GiST indexes to perform index-only scans (Anastasia Lubennikova, Heikki Linnakangas, Andreas Karlsson)
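That's of particular interest to you. Now you can have a single multicolumn (covering) GiST index. Note that including text and bigint columns in a GiST index requires the additional module btree_gist:
CREATE INDEX planet_osm_point_gist_idx ON planet_osm_point USING gist (amenity, way_geo, name, osm_id);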
And:
- Improve bitmap index scan performance (Teodor Sigaev, Tom Lane)
And:
- Add GROUP BY analysis functions GROUPING SETS, CUBE and ROLLUP (Andrew Gierth, Atri Sharma)
Why? Because ROLLUP would simplify the query I suggested.
The first alpha version was released on July 2, 2015. The expected timeline for the release:
This is the alpha release of version 9.5, indicating that some changes to features are still possible before release. The PostgreSQL Project will release 9.5 beta 1 in August, and then periodically release additional betas as required for testing until the final release in late 2015.
Basics
Of course, be sure not to overlook the basics.
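One basic that the query plan in the question hints at: the planner estimates rows=798 for each amenity, while the actual counts are 8811, 6291 and 779. That may indicate outdated or too coarse column statistics. A sketch of how this could be checked and refreshed; the statistics target of 1000 is an arbitrary example value, not a recommendation from the answer:

```sql
-- When were statistics last gathered for the table?
SELECT relname, last_analyze, last_autoanalyze
FROM   pg_stat_user_tables
WHERE  relname = 'planet_osm_point';

-- Optionally collect more detailed statistics for the filter column
-- (1000 is an assumed example value), then refresh the statistics.
ALTER TABLE planet_osm_point ALTER COLUMN amenity SET STATISTICS 1000;
ANALYZE planet_osm_point;
```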