sql - Spatial query on large table with multiple self joins performing slow -


I am working on queries on a large table in Postgres 9.3.9. It is a spatial dataset and it is spatially indexed. Say I need to find three types of objects: A, B and C. The criterion is that B and C are both within a certain distance of A, say 500 meters.

My query is this:

    SELECT
      school.osm_id     AS school_osm_id,
      school.name       AS school_name,
      school.way        AS school_way,
      restaurant.osm_id AS restaurant_osm_id,
      restaurant.name   AS restaurant_name,
      restaurant.way    AS restaurant_way,
      bar.osm_id        AS bar_osm_id,
      bar.name          AS bar_name,
      bar.way           AS bar_way
    FROM
      (SELECT osm_id, name, amenity, way, way_geo
       FROM planet_osm_point
       WHERE amenity = 'school') AS school,
      (SELECT osm_id, name, amenity, way, way_geo
       FROM planet_osm_point
       WHERE amenity = 'restaurant') AS restaurant,
      (SELECT osm_id, name, amenity, way, way_geo
       FROM planet_osm_point
       WHERE amenity = 'bar') AS bar
    WHERE ST_DWithin(school.way_geo, restaurant.way_geo, 500, false)
      AND ST_DWithin(school.way_geo, bar.way_geo, 500, false);

This query gives me what I want, but it takes a long time, about 13 seconds to execute. I'm wondering if there is another way to write the query and make it more efficient.

Query plan:

    Nested Loop  (cost=74.43..28618.65 rows=1 width=177) (actual time=33.513..11235.212 rows=10591 loops=1)
      Buffers: shared hit=530967 read=8733
      ->  Nested Loop  (cost=46.52..28586.46 rows=1 width=174) (actual time=31.998..9595.212 rows=4235 loops=1)
            Buffers: shared hit=389863 read=8707
            ->  Bitmap Heap Scan on planet_osm_point  (cost=18.61..2897.83 rows=798 width=115) (actual time=7.862..150.607 rows=8811 loops=1)
                  Recheck Cond: (amenity = 'school'::text)
                  Buffers: shared hit=859 read=5204
                  ->  Bitmap Index Scan on idx_planet_osm_point_amenity  (cost=0.00..18.41 rows=798 width=0) (actual time=5.416..5.416 rows=8811 loops=1)
                        Index Cond: (amenity = 'school'::text)
                        Buffers: shared hit=3 read=24
            ->  Bitmap Heap Scan on planet_osm_point planet_osm_point_1  (cost=27.91..32.18 rows=1 width=115) (actual time=1.064..1.069 rows=0 loops=8811)
                  Recheck Cond: ((way_geo && _st_expand(planet_osm_point.way_geo, 500::double precision)) AND (amenity = 'restaurant'::text))
                  Filter: ((planet_osm_point.way_geo && _st_expand(way_geo, 500::double precision)) AND _st_dwithin(planet_osm_point.way_geo, way_geo, 500::double precision, false))
                  Rows Removed by Filter: 0
                  Buffers: shared hit=389004 read=3503
                  ->  BitmapAnd  (cost=27.91..27.91 rows=1 width=0) (actual time=1.058..1.058 rows=0 loops=8811)
                        Buffers: shared hit=384528 read=2841
                        ->  Bitmap Index Scan on idx_planet_osm_point_waygeo  (cost=0.00..9.05 rows=137 width=0) (actual time=0.193..0.193 rows=64 loops=8811)
                              Index Cond: (way_geo && _st_expand(planet_osm_point.way_geo, 500::double precision))
                              Buffers: shared hit=146631 read=2841
                        ->  Bitmap Index Scan on idx_planet_osm_point_amenity  (cost=0.00..18.41 rows=798 width=0) (actual time=0.843..0.843 rows=6291 loops=8811)
                              Index Cond: (amenity = 'restaurant'::text)
                              Buffers: shared hit=237897
      ->  Bitmap Heap Scan on planet_osm_point planet_osm_point_2  (cost=27.91..32.18 rows=1 width=115) (actual time=0.375..0.383 rows=3 loops=4235)
            Recheck Cond: ((way_geo && _st_expand(planet_osm_point.way_geo, 500::double precision)) AND (amenity = 'bar'::text))
            Filter: ((planet_osm_point.way_geo && _st_expand(way_geo, 500::double precision)) AND _st_dwithin(planet_osm_point.way_geo, way_geo, 500::double precision, false))
            Rows Removed by Filter: 1
            Buffers: shared hit=141104 read=26
            ->  BitmapAnd  (cost=27.91..27.91 rows=1 width=0) (actual time=0.368..0.368 rows=0 loops=4235)
                  Buffers: shared hit=127019
                  ->  Bitmap Index Scan on idx_planet_osm_point_waygeo  (cost=0.00..9.05 rows=137 width=0) (actual time=0.252..0.252 rows=363 loops=4235)
                        Index Cond: (way_geo && _st_expand(planet_osm_point.way_geo, 500::double precision))
                        Buffers: shared hit=101609
                  ->  Bitmap Index Scan on idx_planet_osm_point_amenity  (cost=0.00..18.41 rows=798 width=0) (actual time=0.104..0.104 rows=779 loops=4235)
                        Index Cond: (amenity = 'bar'::text)
                        Buffers: shared hit=25410
    Total runtime: 11238.605 ms

I'm using just one table at the moment, with 1,372,711 rows. It has 73 columns:

           Column       |         Type         |       Modifiers
    --------------------+----------------------+---------------------------
     osm_id             | bigint               |
     access             | text                 |
     addr:housename     | text                 |
     addr:housenumber   | text                 |
     addr:interpolation | text                 |
     admin_level        | text                 |
     aerialway          | text                 |
     aeroway            | text                 |
     amenity            | text                 |
     area               | text                 |
     barrier            | text                 |
     bicycle            | text                 |
     brand              | text                 |
     bridge             | text                 |
     boundary           | text                 |
     building           | text                 |
     capital            | text                 |
     construction       | text                 |
     covered            | text                 |
     culvert            | text                 |
     cutting            | text                 |
     denomination       | text                 |
     disused            | text                 |
     ele                | text                 |
     embankment         | text                 |
     foot               | text                 |
     generator:source   | text                 |
     harbour            | text                 |
     highway            | text                 |
     historic           | text                 |
     horse              | text                 |
     intermittent       | text                 |
     junction           | text                 |
     landuse            | text                 |
     layer              | text                 |
     leisure            | text                 |
     lock               | text                 |
     man_made           | text                 |
     military           | text                 |
     motorcar           | text                 |
     name               | text                 |
     natural            | text                 |
     office             | text                 |
     oneway             | text                 |
     operator           | text                 |
     place              | text                 |
     poi                | text                 |
     population         | text                 |
     power              | text                 |
     power_source       | text                 |
     public_transport   | text                 |
     railway            | text                 |
     ref                | text                 |
     religion           | text                 |
     route              | text                 |
     service            | text                 |
     shop               | text                 |
     sport              | text                 |
     surface            | text                 |
     toll               | text                 |
     tourism            | text                 |
     tower:type         | text                 |
     tunnel             | text                 |
     water              | text                 |
     waterway           | text                 |
     wetland            | text                 |
     width              | text                 |
     wood               | text                 |
     z_order            | integer              |
     tags               | hstore               |
     way                | geometry(Point,4326) |
     way_geo            | geography            |
     gid                | integer              | not null default nextval('...
    Indexes:
        "planet_osm_point_pkey1" PRIMARY KEY, btree (gid)
        "idx_planet_osm_point_amenity" btree (amenity)
        "idx_planet_osm_point_waygeo" gist (way_geo)
        "planet_osm_point_index" gist (way)
        "planet_osm_point_pkey" btree (osm_id)

There are 8811, 6291 and 779 rows with amenity school, restaurant and bar, respectively.

This query should go a long way (be much faster):

    WITH school AS (
       SELECT s.osm_id AS school_id, text 'school' AS type, s.osm_id, s.name, s.way_geo
       FROM   planet_osm_point s
            , LATERAL (
          SELECT 1 FROM planet_osm_point
          WHERE  ST_DWithin(way_geo, s.way_geo, 500, false)
          AND    amenity = 'bar'
          LIMIT  1  -- bar exists -- most selective first if possible
          ) b
            , LATERAL (
          SELECT 1 FROM planet_osm_point
          WHERE  ST_DWithin(way_geo, s.way_geo, 500, false)
          AND    amenity = 'restaurant'
          LIMIT  1  -- restaurant exists
          ) r
       WHERE  s.amenity = 'school'
       )
    SELECT * FROM (
       TABLE school  -- schools

       UNION ALL  -- bars
       SELECT s.school_id, 'bar', x.*
       FROM   school s
            , LATERAL (
          SELECT osm_id, name, way_geo
          FROM   planet_osm_point
          WHERE  ST_DWithin(way_geo, s.way_geo, 500, false)
          AND    amenity = 'bar'
          ) x

       UNION ALL  -- restaurants
       SELECT s.school_id, 'rest.', x.*
       FROM   school s
            , LATERAL (
          SELECT osm_id, name, way_geo
          FROM   planet_osm_point
          WHERE  ST_DWithin(way_geo, s.way_geo, 500, false)
          AND    amenity = 'restaurant'
          ) x
       ) sub
    ORDER  BY school_id, (type <> 'school'), type, osm_id;

This is not the same as your original query, but rather what you actually want, as per the discussion in the comments:

  "I want to have a list of all the schools that have restaurants and bars within 500 meters and I need the coordinates of each school and its corresponding restaurants and bars."

So this query returns a list of those schools, each followed by the bars and restaurants nearby. Each set of rows is held together by the osm_id of the school in the column school_id.

It now uses LATERAL joins to make use of the spatial GiST index.
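
The existence checks that the school filter performs can also be phrased with EXISTS, which may be easier to read. What follows is a minimal sketch of that alternative formulation, not the query from the answer. It assumes the same planet_osm_point table and way_geo column shown in the question; whether the planner actually uses the GiST index is something to confirm from the EXPLAIN output.

```sql
-- Sketch: schools with at least one bar and one restaurant within 500 m.
-- EXPLAIN (ANALYZE, BUFFERS) runs the query and reports the plan actually used.
EXPLAIN (ANALYZE, BUFFERS)
SELECT s.osm_id, s.name, s.way_geo
FROM   planet_osm_point s
WHERE  s.amenity = 'school'
AND    EXISTS (
   SELECT 1
   FROM   planet_osm_point b
   WHERE  b.amenity = 'bar'
   AND    ST_DWithin(b.way_geo, s.way_geo, 500, false)
   )
AND    EXISTS (
   SELECT 1
   FROM   planet_osm_point r
   WHERE  r.amenity = 'restaurant'
   AND    ST_DWithin(r.way_geo, s.way_geo, 500, false)
   );
```

In the plan, an index scan or bitmap index scan on idx_planet_osm_point_waygeo inside the subplans would indicate that the spatial index is being used per school.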

TABLE school is just shorthand for SELECT * FROM school.

The expression (type <> 'school') orders the school first in each set, because false sorts before true.
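
The effect of ordering by a boolean expression can be tried in isolation. A small sketch with made-up sample rows (the VALUES list is hypothetical and stands in for the type column of the real result):

```sql
-- Hypothetical sample rows; only the ORDER BY expression matters here.
SELECT type, (type <> 'school') AS not_school
FROM  (VALUES ('rest.'), ('school'), ('bar')) AS v(type)
ORDER  BY (type <> 'school'), type;
-- 'school' evaluates to false and sorts first;
-- 'bar' and 'rest.' evaluate to true and follow in alphabetical order.
```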

The subquery sub in the final SELECT is only needed to order by this expression. A UNION query limits an attached ORDER BY list to output columns only, no expressions.
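
A short sketch of that restriction, using literal rows instead of the real table (the values are made up for illustration):

```sql
-- Ordering a UNION directly by an expression is rejected by Postgres
-- ("invalid UNION/INTERSECT/EXCEPT ORDER BY clause"):
--   SELECT text 'school' AS type UNION ALL SELECT 'bar'
--   ORDER BY (type <> 'school');

-- Wrapping the UNION in a subquery makes the expression legal:
SELECT *
FROM  (
   SELECT text 'school' AS type
   UNION ALL
   SELECT 'rest.'
   UNION ALL
   SELECT 'bar'
   ) sub
ORDER  BY (type <> 'school'), type;
```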

I focus on the query you presented for the purpose of this answer, ignoring the extended requirement to filter on any of the other 70 text columns. That's really a design flaw. Search criteria should be concentrated in a few columns. Otherwise you'll have to index all 70 columns, and multicolumn indexes like the one I am going to propose are hardly an option. Still possible though ...

Index

In addition to the existing:

"idx_planet_osm_point_waygeo" gist (way_geo) 

If you are always filtering on the same column, you could create a multicolumn index covering the few columns you are interested in, so that index-only scans become possible:

    CREATE INDEX planet_osm_point_bar_idx ON planet_osm_point (amenity, name, osm_id);
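
Index-only scans also depend on the visibility map being reasonably current, so it helps to vacuum the table before testing. A sketch of how one might check whether the index is used that way, assuming the index planet_osm_point_bar_idx has been created as above:

```sql
-- Refresh the visibility map and planner statistics first.
VACUUM ANALYZE planet_osm_point;

-- The query touches only columns contained in the index.
EXPLAIN (ANALYZE, BUFFERS)
SELECT osm_id, name
FROM   planet_osm_point
WHERE  amenity = 'bar';
-- In the output, look for "Index Only Scan using planet_osm_point_bar_idx"
-- and a low "Heap Fetches" count.
```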

Postgres 9.5

The upcoming Postgres 9.5 introduces major improvements that happen to address your case exactly:

  • Allow queries to perform accurate distance filtering of bounding-box-indexed objects (polygons, circles) using GiST indexes (Alexander Korotkov, Heikki Linnakangas)

    Previously, a common table expression was required to return a large number of rows ordered by bounding-box distance, and then filtered further with a more accurate non-bounding-box distance calculation.

  • Allow GiST indexes to perform index-only scans (Anastasia Lubennikova, Heikki Linnakangas, Andreas Karlsson)

That's of particular interest for you. Now you can have a single multicolumn (covering) GiST index:

    CREATE INDEX planet_osm_point_amenity_waygeo_idx ON planet_osm_point USING gist (amenity, way_geo, name, osm_id);
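
GiST has no built-in operator classes for plain text or bigint columns, so a multicolumn GiST index that includes amenity, name and osm_id needs the btree_gist extension installed first. A sketch of that step plus a quick check of the resulting plan; the point coordinates are made up for illustration, and the plan the planner picks on real data may differ:

```sql
-- Needed once per database, before creating the multicolumn GiST index.
CREATE EXTENSION IF NOT EXISTS btree_gist;

-- After the index exists, inspect the plan for a typical probe:
-- bars within 500 m of a hypothetical point (lon/lat chosen arbitrarily).
EXPLAIN (ANALYZE, BUFFERS)
SELECT osm_id, name
FROM   planet_osm_point
WHERE  amenity = 'bar'
AND    ST_DWithin(
          way_geo,
          ST_SetSRID(ST_MakePoint(-73.99, 40.73), 4326)::geography,
          500,
          false);
```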

And:

  • Improve bitmap index scan performance (Teodor Sigaev, Tom Lane)

And:

  • Add GROUP BY analysis functions GROUPING SETS, CUBE and ROLLUP (Andrew Gierth, Atri Sharma)

Why? Because ROLLUP would simplify the query I suggested.

The first alpha version was released on July 2, 2015. The expected timeline for the release:

  "This is the alpha release of version 9.5, indicating that some changes to features are still possible before release. The PostgreSQL Project will release 9.5 beta 1 in August, and then periodically release additional betas as required for testing until the final release in late 2015."

Basics

Of course, be sure not to overlook the basics.

