python - Prioritizing Greenlet workers for parallel reads/writes/DB access


I need to read 3 CSV files of about 10 GB each and write parts of them out to 3 files. In the middle there are some minor conditions to evaluate and a MongoDB query (against a 6-billion-document collection, indexed) for each row.

I am thinking of using gevent pools for the task, but I am not sure how to prioritize the read tasks over the writes while ensuring the reads have finished before the writers exit.

I don't want to block the writers until the reads have finished.

  • I can spawn 3 readers that put rows into a read queue.
  • I can spawn 20-25 IO processors that read from that queue, make the MongoDB call, and put results into a write queue.
  • I can spawn 3 writers that read from the write queue and write the files.
  • I can joinall on the pools (a rough sketch of this wiring follows the list).
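
Roughly, I imagine wiring it up like this. This is only a minimal sketch: the file paths, the Mongo database/collection/field names, and the way rows are routed to output files are all placeholders, and a sentinel object plus an explicit join on the reader pool is used to shut the later stages down instead of timeouts.

```python
import csv
from gevent import monkey; monkey.patch_all()   # make file/socket I/O cooperative
from gevent.pool import Pool
from gevent.queue import Queue
from pymongo import MongoClient

INPUT_FILES  = ['in1.csv', 'in2.csv', 'in3.csv']      # placeholder paths
OUTPUT_FILES = ['out1.csv', 'out2.csv', 'out3.csv']   # placeholder paths
N_PROCESSORS = 25
SENTINEL = object()                  # poison pill used to shut a stage down

read_q  = Queue(maxsize=5000)        # bounded, so readers block instead of filling RAM
write_q = Queue(maxsize=5000)

def reader(path):
    with open(path, newline='') as f:
        for row in csv.reader(f):
            read_q.put(row)          # blocks while the queue is full

def processor(coll):
    while True:
        row = read_q.get()
        if row is SENTINEL:
            break
        doc = coll.find_one({'key': row[0]})      # indexed lookup; field name is made up
        if doc is not None:                       # the "minor conditions" would go here
            write_q.put(row + [doc['value']])     # 'value' is also a placeholder field

def writer(path):
    # Rows land in whichever writer dequeues them; real routing logic is still TBD.
    with open(path, 'w', newline='') as f:
        out = csv.writer(f)
        while True:
            item = write_q.get()
            if item is SENTINEL:
                break
            out.writerow(item)

def main():
    coll = MongoClient()['mydb']['mycoll']        # placeholder database/collection

    readers, processors, writers = Pool(3), Pool(N_PROCESSORS), Pool(3)
    for path in INPUT_FILES:
        readers.spawn(reader, path)
    for _ in range(N_PROCESSORS):
        processors.spawn(processor, coll)
    for path in OUTPUT_FILES:
        writers.spawn(writer, path)

    readers.join()                                # only main blocks here; workers keep running
    for _ in range(N_PROCESSORS):
        read_q.put(SENTINEL)                      # no more input: let processors drain and exit
    processors.join()
    for _ in range(len(OUTPUT_FILES)):
        write_q.put(SENTINEL)                     # same hand-off for the writers
    writers.join()

if __name__ == '__main__':
    main()
```

With this arrangement the writers are never blocked waiting on the readers; they simply keep draining write_q, and each stage exits only after the previous stage has finished and its sentinels arrive.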

Now, can I keep a queue timeout in the IO processors and writers to ensure all of the readers have put their complete data into the queue? Or is it possible to put a join on the readers at the end of the IO processors?
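
If I went the timeout route instead, I suppose each consumer would look roughly like the snippet below. This is only a sketch: the 30-second value is arbitrary, and a stalled producer would be indistinguishable from a finished one.

```python
from gevent.queue import Empty

def drain_with_timeout(q, idle_timeout=30):
    # Consume from q until it has been idle for idle_timeout seconds.
    while True:
        try:
            item = q.get(timeout=idle_timeout)   # raises Empty once nothing arrives in time
        except Empty:
            return                               # assumes the producers are done -- fragile
        print(item)                              # stand-in for the real per-row work
```

The obvious downside is that the timeout has to be longer than any pause in the pipeline (for example a slow batch of Mongo lookups), otherwise a consumer exits early, which is why the join-then-sentinel hand-off above looks safer to me.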

In short, I want to learn whether there is a more optimal approach to perform this task efficiently.

