MySQL - how to get the average of rows that have a certain relationship


I have a bunch of data pertaining to county demographics stored in a database. I need to be able to access the average of the data within the state of a given county. For example, I need to be able to get the average of all counties whose state_id matches the state_id of the county with a county_id of 1. Essentially, if a county is in Virginia, I need the average of all of the counties in Virginia. I'm having trouble setting up this query, and I was hoping you guys could give me some help. Here's what I have written, but it only returns one row from the database because of the linking of the county_id of the two tables together.

select avg(demographic_data.percent_white) as avg_percent_white
  from demographic_data, counties, states
 where counties.county_id = demographic_data.county_id
   and counties.state_id = states.state_id

Here's the basic database layout:

counties
------------------------
county_id | county_name

states
---------------------
state_id | state_name

demographic_data
-----------------------------------------
percent_white | percent_black | county_id

Your query is returning one row because there's an aggregate and no GROUP BY. If you want an average of all counties within a state, we'd expect only one row.

To get a "statewide" average of all counties within a state, here's one way to do it:

select avg(d.percent_white) as avg_percent_white
  from demographic_data d
  join counties a
    on a.county_id = d.county_id
  join counties o
    on o.state_id = a.state_id
 where o.county_id = 42

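Note that there's no need to join to the states table. You just need all the counties that have a matching state_id. The query above uses two references to the counties table. One reference is aliased as "a" and gets all the counties within a state; the other reference is aliased as "o" and gets the state_id for a particular county.

If you already had the state_id, you wouldn't need the second reference: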

select avg(d.percent_white) as avg_percent_white
  from demographic_data d
  join counties a
    on a.county_id = d.county_id
 where a.state_id = 11
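To try the self-join against something concrete, here is a minimal sketch with made-up rows. It assumes counties carries a state_id column (the queries rely on it even though the layout in the question doesn't list it); the county names, ids, and percentages are hypothetical.

```sql
-- Hypothetical sample data. counties.state_id is an assumption:
-- the queries need it, but the posted layout does not show it.
CREATE TABLE counties (
  county_id   INT PRIMARY KEY,
  county_name VARCHAR(64),
  state_id    INT
);

CREATE TABLE demographic_data (
  county_id     INT PRIMARY KEY,
  percent_white DECIMAL(5,2),
  percent_black DECIMAL(5,2)
);

-- Three counties in state 11, one county in state 12.
INSERT INTO counties VALUES
  (1,  'County A', 11),
  (2,  'County B', 11),
  (42, 'County C', 11),
  (7,  'County D', 12);

INSERT INTO demographic_data VALUES
  (1,  60.00, 10.00),
  (2,  70.00,  8.00),
  (42, 65.00,  9.00),
  (7,  80.00,  5.00);

-- "o" finds the state of county 42; "a" fans out to every county in that state.
SELECT AVG(d.percent_white) AS avg_percent_white
  FROM demographic_data d
  JOIN counties a
    ON a.county_id = d.county_id
  JOIN counties o
    ON o.state_id = a.state_id
 WHERE o.county_id = 42;
```

With these rows the query averages only the three state 11 counties (60, 70 and 65); the state 12 row is left out.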

Followup

Q: What if I wanted to bring in another table? Let's call it demographic_data_2, which is linked via county_id.

A: I made the assumption that the demographic_data table had one row per county_id. If the same holds true for the second table, then it's a simple join operation.

  join demographic_data_2 c     on c.county_id = d.county_id  

With that table joined in, you would add the appropriate aggregate expression in the select list (e.g. SUM, MIN, MAX, AVG).

The trouble spots are typically "missing" and "duplicate" data. When there isn't a row for every county_id in the second table, or there's more than one row for a particular county_id, that leads to rows not being included in the aggregate, or getting double counted in the aggregate.
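One possible way to guard against both problems is to collapse the second table to one row per county_id in a derived table and then LEFT JOIN it, so no county is dropped or counted twice. This is only a sketch: the table names demo_d and demo_d2 (standing in for demographic_data and demographic_data_2) and the median_age column are hypothetical, and whether averaging duplicate rows is the right way to collapse them depends on what the data means.

```sql
-- Hypothetical stand-ins for demographic_data and demographic_data_2.
CREATE TABLE demo_d  (county_id INT PRIMARY KEY, percent_white DECIMAL(5,2));
CREATE TABLE demo_d2 (county_id INT, median_age DECIMAL(5,2));

INSERT INTO demo_d VALUES (1, 60.00), (2, 70.00), (3, 80.00);

-- County 2 has two rows (duplicate); county 3 has no row (missing).
INSERT INTO demo_d2 VALUES (1, 30.00), (2, 40.00), (2, 44.00);

-- Pre-aggregate demo_d2 to one row per county, then LEFT JOIN so that
-- county 3 still contributes to the percent_white average.
SELECT AVG(d.percent_white) AS avg_percent_white,
       AVG(c.median_age)    AS avg_median_age
  FROM demo_d d
  LEFT JOIN ( SELECT county_id, AVG(median_age) AS median_age
                FROM demo_d2
               GROUP BY county_id
            ) c
    ON c.county_id = d.county_id;
```

With a plain inner join to demo_d2, county 3 would drop out and county 2 would be counted twice, pulling the percent_white average away from the 70 that the three counties actually average to. AVG ignores the NULL median_age that the LEFT JOIN produces for county 3.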


We note that the aggregate returned in the original query is an "average of averages". It's an average of the values for each county.

Consider:

bucket  count_red  count_blue  count_total  percent_red
------  ---------  ----------  -----------  -----------
     1        480           4         1000           48
     2         60           1          200           30

Note that there's a difference between an "average of averages" and calculating an average using totals.

select avg(percent_red) avg_percent_red
     , sum(count_red)/sum(count_total) tot_percent_red

avg_percent_red  tot_percent_red
---------------  ---------------
             39               45

Both values are valid, but we don't want to misinterpret or misrepresent either value.
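The two figures can be reproduced with a small self-contained script. The table name buckets is hypothetical, and the ratio is multiplied by 100 here on the assumption that the second value is meant as a percentage like the first.

```sql
-- Hypothetical table holding the two example buckets.
CREATE TABLE buckets (
  bucket      INT PRIMARY KEY,
  count_red   INT,
  count_total INT,
  percent_red DECIMAL(5,2)
);

INSERT INTO buckets VALUES
  (1, 480, 1000, 48),
  (2,  60,  200, 30);

-- avg_percent_red: unweighted mean of the per-bucket percentages, (48 + 30) / 2.
-- tot_percent_red: percentage computed from the totals, 100 * 540 / 1200.
SELECT AVG(percent_red)                        AS avg_percent_red,
       100 * SUM(count_red) / SUM(count_total) AS tot_percent_red
  FROM buckets;
```

The first column weights every bucket equally, while the second weights each bucket by its count_total, which is why the larger bucket pulls the result from 39 up to 45.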

