python - How to subset a pandas DataFrame by a two-column list of any length


I have tried different combinations of boolean arrays and .isin constructions, but my pandas-fu is not strong enough.

If I have the following example dataframe:

    In [1]: import pandas as pd
            exampledf = pd.DataFrame({'factor1': ['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd'],
                                      'factor2': ['e', 'e', 'e', 'e', 'f', 'f', 'f', 'f'],
                                      'numeric': [1., 2., 3., 4., 5., 6., 7., 8.]})

I need to pass a list of (factor1, factor2) pairs of any length and get back the subset of the dataframe whose rows match one of those factor combinations.

For example:

    In [2]: def factorfilter(df, factorlist):
                # code goes here
                # returns dataframe

            factorfilter(exampledf, [['a', 'e'], ['c', 'f']])

    Out[2]:
      factor1 factor2  numeric
    0       a       e        1
    6       c       f        7

(If there's a better way to set up the lists, I'm all ears; this is just what occurred to me as easy to produce and pass to a function.)

You can utilize a multi-index (an index built from more than one column). Two ways of building the index for your example schema come to mind.

    import pandas as pd
    index = pd.MultiIndex.from_product([list('abcd'), list('ef')],
                                       names=['factor1', 'factor2'])

or

    factor1 = list('abcdabcd')
    factor2 = list('eeeeffff')
    index = pd.MultiIndex.from_tuples(list(zip(factor1, factor2)),
                                      names=['factor1', 'factor2'])
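One thing worth checking before attaching data: the two constructions contain the same pairs, but not in the same order, and the numeric column is attached by position. A small sketch that compares them (the names `by_product` and `by_tuples` are only for illustration):

```python
import pandas as pd

# Cartesian product: the first level varies slowest -> (a,e), (a,f), (b,e), ...
by_product = pd.MultiIndex.from_product(
    [list("abcd"), list("ef")], names=["factor1", "factor2"]
)

# Explicit tuples: order follows the zipped lists -> (a,e), (b,e), (c,e), ...
by_tuples = pd.MultiIndex.from_tuples(
    list(zip("abcdabcd", "eeeeffff")), names=["factor1", "factor2"]
)

same_pairs = set(by_product) == set(by_tuples)    # same set of combinations
same_order = list(by_product) == list(by_tuples)  # but a different row order

print(same_pairs, same_order)
```

So which numeric value ends up next to which pair depends on the construction you pick.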

From this, you can create a multi-index dataframe by

    numerics = list(range(1, 9))
    df = pd.DataFrame({'numeric': numerics}, index=index)
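If the data already lives in a flat dataframe like `exampledf` from the question, it may be simpler to promote the two factor columns to the index with `set_index` rather than building the index by hand. A minimal sketch, assuming the question's column names (`indexed` is an illustrative name):

```python
import pandas as pd

exampledf = pd.DataFrame({
    "factor1": ["a", "b", "c", "d", "a", "b", "c", "d"],
    "factor2": ["e", "e", "e", "e", "f", "f", "f", "f"],
    "numeric": [1., 2., 3., 4., 5., 6., 7., 8.],
})

# Move both factor columns into a two-level index; each numeric value
# stays attached to the pair it had in the original rows.
indexed = exampledf.set_index(["factor1", "factor2"])

# Look up one pair by passing a list containing a single tuple.
print(indexed.loc[[("c", "f")]])
```

This keeps the pairing from the question, where ('c', 'f') maps to 7.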

df outputs

                     numeric
    factor1 factor2
    a       e              1
            f              2
    b       e              3
            f              4
    c       e              5
            f              6
    d       e              7
            f              8

    [8 rows x 1 columns]

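Then you can retrieve a subset of the index by passing a list of tuples to the loc indexer (older pandas versions also accepted ix here, but ix was removed in pandas 1.0).

    subdf = df.loc[[('a', 'e'), ('c', 'f')]]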

subdf outputs

                     numeric
    factor1 factor2
    a       e              1
    c       f              6

    [2 rows x 1 columns]
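To get the `factorfilter(df, factorlist)` signature from the question without changing the dataframe's index, one option is to build a temporary multi-index from the two columns and test membership with `MultiIndex.isin`. This is a sketch rather than the only way to do it; the `columns` parameter is an addition for illustration, and `MultiIndex.from_frame` assumes pandas 0.24 or later.

```python
import pandas as pd


def factorfilter(df, factorlist, columns=("factor1", "factor2")):
    """Return the rows of df whose values in `columns` match a pair in factorlist."""
    # Temporary two-level index built from the factor columns, in row order.
    keys = pd.MultiIndex.from_frame(df[list(columns)])
    # isin expects tuples, so convert pairs given as lists.
    wanted = [tuple(pair) for pair in factorlist]
    # Boolean mask aligned by position with the rows of df.
    return df[keys.isin(wanted)]


exampledf = pd.DataFrame({
    "factor1": ["a", "b", "c", "d", "a", "b", "c", "d"],
    "factor2": ["e", "e", "e", "e", "f", "f", "f", "f"],
    "numeric": [1., 2., 3., 4., 5., 6., 7., 8.],
})

result = factorfilter(exampledf, [["a", "e"], ["c", "f"]])
print(result)  # rows 0 and 6 of exampledf
```

The original row labels and column layout are preserved, which matches the output shown in the question.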
