Faster datastore_search in CKAN

CKAN’s datastore_search now comes with format options and is up to 17x faster.

This article covers:

  • new total row calculation
  • new result generation
  • new record formats

datastore_search Performance Improvements

TL;DR:

  1. Use CKAN 2.7 or later: resource views and other code that uses datastore_search become faster with no other changes required
  2. Update your datastore_search client code to use the new records_format=csv and/or include_total=false options to make it much faster still
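
As a rough sketch of the second point, client code might request CSV records and skip the total row count like this. The site URL and resource id are placeholders, and the helper names (build_search_url, parse_csv_records, fetch_rows) are made up for illustration. The sketch also assumes that with records_format=csv the "records" value comes back as a single CSV string without a header row.

```python
import csv
import io
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_search_url(site_url, resource_id, limit=100, offset=0):
    """Build a datastore_search URL that asks for CSV records and no total."""
    params = {
        "resource_id": resource_id,
        "limit": limit,
        "offset": offset,
        "records_format": "csv",    # records returned as one CSV string
        "include_total": "false",   # skip the total row calculation
    }
    return site_url.rstrip("/") + "/api/3/action/datastore_search?" + urlencode(params)


def parse_csv_records(response_body):
    """Turn the JSON response body into a list of rows (lists of strings)."""
    result = json.loads(response_body)["result"]
    # Assumption: "records" is a CSV string with no header row.
    return list(csv.reader(io.StringIO(result["records"])))


def fetch_rows(site_url, resource_id, limit=100):
    """Fetch one page of rows from a CKAN site (performs a network request)."""
    with urlopen(build_search_url(site_url, resource_id, limit)) as response:
        return parse_csv_records(response.read().decode("utf-8"))
```

Column names are not part of the CSV string in this sketch, so a client that needs them would read them from the rest of the response.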

Resample NumPy Array without Feature Loss

For pyrf I needed to take data from a frequency plot, which could be any number of points, and present it as a spectrogram that fills the view size exactly. In the spectrogram I only care about the maximum values that appear in the range of frequencies represented by each pixel.

If I could just divide the number of source bins by an integer factor the solution would be simple:

return np.amax(data.reshape((-1, factor)), axis=1)

But I have to be able to handle any number of source bins and output that to any number of pixels.

Fortunately numpy is awesome.
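
One way to handle arbitrary sizes, offered as a sketch and not necessarily the approach pyrf ended up with, is to compute the first source bin for each pixel and let np.maximum.reduceat take the maximum over each range. The function name resample_max is made up for this example, and it assumes 1-D input.

```python
import numpy as np


def resample_max(data, pixels):
    """Resample 1-D data to `pixels` values, keeping the max of each range."""
    data = np.asarray(data)
    n = data.shape[0]
    if n == 0 or pixels <= 0:
        raise ValueError("need at least one source bin and one pixel")
    # Index of the first source bin covered by each output pixel.
    starts = (np.arange(pixels) * n) // pixels
    if pixels <= n:
        # Starts are strictly increasing here, so reduceat takes the max over
        # data[starts[i]:starts[i + 1]], with the last range running to the end.
        return np.maximum.reduceat(data, starts)
    # More pixels than bins: each pixel just shows the bin it falls in.
    return data[starts]
```

For example, resample_max([1, 5, 2, 8, 3, 7], 3) gives [5, 8, 7], and a narrow peak in the source is never averaged away because every source bin lands in exactly one pixel's range.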
