How to Deal with I/O Expense

  • For a lot of problems, processors are fast compared to the cost of communicating with a hardware device.

    • Local hardware is, in turn, usually faster than communicating over a network

  • Common sources of I/O cost

    • disk

    • databases

    • file

    • network

    • hardware not close to the CPU, e.g. USB devices, external drives, etc.

  • Reducing I/O cost often yields a large performance gain

How

  • Caching

    • Caching is avoiding I/O (generally avoiding the reading of some abstract value) by storing a copy of that value locally so no I/O is performed to get the value.

      • Or putting it somewhere with a lower I/O cost than where it was originally retrieved/accessed from

    • The first key to caching is to make it crystal clear which data is the master and which are copies.

      • There is only one master - period.

      • Related approaches for propagating changes to the copies: gossip protocols, master/slave replication

    • Caching brings with it the danger that a copy can be stale, because it sometimes can't reflect changes to the master instantaneously
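The caching points above can be sketched as a small read-through cache. This is a minimal illustration, not a definitive implementation: the class name `ReadThroughCache`, the `fetch` callback, and the `ttl_seconds` expiry are assumptions made for the example. The master stays wherever `fetch` reads from, and the cache only ever holds copies, which may be stale until they expire or are invalidated.

```python
import time


class ReadThroughCache:
    """Holds local copies of values whose master lives behind a slow fetch."""

    def __init__(self, fetch, ttl_seconds=30.0, clock=time.monotonic):
        self._fetch = fetch        # hypothetical expensive I/O call to the master
        self._ttl = ttl_seconds    # how long a copy is trusted before re-reading
        self._clock = clock        # injectable so expiry can be tested
        self._copies = {}          # key -> (value, time the copy was stored)

    def get(self, key):
        entry = self._copies.get(key)
        if entry is not None:
            value, stored_at = entry
            if self._clock() - stored_at < self._ttl:
                return value       # served from the local copy, no I/O
        value = self._fetch(key)   # miss or expired copy: read the master
        self._copies[key] = (value, self._clock())
        return value

    def invalidate(self, key):
        """Drop the copy, e.g. when the master is known to have changed."""
        self._copies.pop(key, None)
```

Within the expiry window a changed master is not noticed, which is exactly the staleness danger described above. A shorter expiry trades more I/O for fresher copies.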

  • Representation

    • Representation is the approach of making I/O cheaper by representing data more efficiently.

    • This is often in tension with other demands, like human readability and portability.

    • e.g.

      • using a binary representation instead of one that is human readable

      • transmitting a dictionary of symbols along with the data, so that long symbols don't have to be repeated in full

      • at the extreme, compression schemes such as Huffman encoding
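The first two representation ideas above can be sketched with the standard library. The record layout (a 32-bit sensor id plus a 64-bit float), the field names, and the helper names are assumptions chosen for illustration. The example compares a human-readable JSON encoding with a fixed-width binary one, then shows a simple symbol table that replaces repeated long strings with small integers.

```python
import json
import struct

# Hypothetical sample data: (sensor_id, value) pairs.
readings = [(i, i * 0.5) for i in range(1000)]

# Little-endian, no padding: unsigned 32-bit int + 64-bit float = 12 bytes.
RECORD = struct.Struct("<Id")


def encode_text(rows):
    """Human-readable encoding: field names are repeated in every record."""
    payload = [{"sensor_id": s, "value": v} for s, v in rows]
    return json.dumps(payload).encode("utf-8")


def encode_binary(rows):
    """Compact encoding: fixed-width records, no field names on the wire."""
    return b"".join(RECORD.pack(s, v) for s, v in rows)


def decode_binary(blob):
    return [RECORD.unpack_from(blob, off)
            for off in range(0, len(blob), RECORD.size)]


def encode_with_symbol_table(symbols):
    """Send each distinct symbol once, then refer to it by index."""
    table = sorted(set(symbols))
    index = {s: i for i, s in enumerate(table)}
    return table, [index[s] for s in symbols]


def decode_with_symbol_table(table, codes):
    return [table[c] for c in codes]
```

Here the binary form is 12 bytes per record, while the JSON form spends most of its bytes on repeated field names and punctuation. The cost is the tension noted above: the binary blob cannot be read by eye, and both sides must agree on the layout.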

  • Pushing the computation closer to the data

    • This improves locality of reference: the work happens where the data lives, so less data has to cross the I/O boundary

    • If you are reading some data from a database and computing something simple from it, such as a summation, try to get the database server to do it for you

    • If you are searching or sorting some data, instead of getting it all into memory, ask the database to do it for you and retrieve the results
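The two database suggestions above can be sketched with Python's built-in `sqlite3` module. The `orders` table and its `amount` column are hypothetical, and an in-memory SQLite database is only a stand-in: SQLite runs in-process, so the saving shown here would matter far more against a database server reached over a network.

```python
import sqlite3

# Hypothetical data set: 100 orders with amounts 1..100.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany("INSERT INTO orders (amount) VALUES (?)",
                 [(n,) for n in range(1, 101)])


def total_client_side(conn):
    """Pulls every row across the I/O boundary, then sums locally."""
    return sum(amount for (amount,) in conn.execute("SELECT amount FROM orders"))


def total_in_database(conn):
    """Lets the database compute the sum; only one row comes back."""
    (total,) = conn.execute("SELECT SUM(amount) FROM orders").fetchone()
    return total


def top_amounts(conn, limit):
    """Lets the database sort and trim instead of sorting in memory."""
    rows = conn.execute(
        "SELECT amount FROM orders ORDER BY amount DESC LIMIT ?", (limit,))
    return [amount for (amount,) in rows]
```

Both totals are 5050, but the second version transfers a single value instead of 100 rows, and `top_amounts(conn, 3)` returns `[100, 99, 98]` without loading the whole table.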
