Changing Something From Iterating Over A Numpy Array To Vectorization
Solution 1:
You can significantly improve your performance using np.where
to get the indices where your conditions happen:
ind = np.where( flow_direction_np==32 )
you will see that ind
is a tuple with two elements, the first is the indices of the first axis and the second of the second axis of your flow_direction_np
array.
You can work out with this indices to apply the shifts: i-1
, j-1
and so on...
ind_32 = (ind[0]-1, ind[1]-1)
Then you use fancy indexing to update the arrays:
elevation_gain[ ind_32 ] += sediment_transport_np[ ind ]
EDIT: applying this concept to your case would give something like this:
lookup= {32:(-1, -1),
16:(0, -1),
8:(+1, -1),
4:(+1, 0),
64:(-1, 0),
128:(-1, +1),
1:(0, +1),
2:(+1, +1)}
fornum,shiftinlookup.iteritems():ind=np.where(flow_direction_np==num)ind_num=ind[0]+shift[0],ind[1]+shift[1]elevation_gain[ind_num]+=sediment_transport_np[ind]
Solution 2:
The reason that you're getting different results is due to the way python handles negative indexing.
For other folks reading, this question (and answer) are a follow up from here: Iterating through a numpy array and then indexing a value in another array
First off, I apologize that the "vectorized" code is as dense as it is. There's a through explanation in my earlier answer, so I won't repeat it here.
Your original code (in the original question) is actually subtly different than the version you posted here.
Basically, before you had
for [i, j], flow in np.ndenumerate(flow_direction_np):
try:
if flow == 32:
...
elif ...
...
and you were getting an index error when i+1
or j+1
was greater than the size of the grid.
Just doing:
for [i, j], flow in np.ndenumerate(flow_direction_np):
try:
if flow == 32:
...
elif ...
...
except IndexError:
elevation_change[i, j] = 0
is actually incorrect because it defines different boundary conditions on different sides of the grid.
In the second case, when j-1
or i-1
is negative, the value from the opposite side of the grid will be returned. However, when j+1
or i+1
is greater than the size of the grid, 0
will be returned. (Thus the "different boundary conditions".)
In the vectorized version of the code, 0
is returned both when the indices are negative and when they're beyond the bounds of the grid.
As a quick example, notice what happens with the following:
In [1]: x = [1, 2, 3]
In [2]: x[0]
Out[2]: 1In [3]: x[1]
Out[3]: 2In [4]: x[2]
Out[4]: 3In [5]: x[3]
---------------------------------------------------------------------------
IndexError Traceback (most recent calllast)
<ipython-input-5-ed224ad0520d>in<module>()
----> 1 x[3]
IndexError: list index outofrangeIn [6]: x[-1]
Out[6]: 3In [7]: x[-2]
Out[7]: 2In [8]: x[-3]
Out[8]: 1In [9]: x[-4]
---------------------------------------------------------------------------
IndexError Traceback (most recent calllast)
<ipython-input-9-f9c639f21256>in<module>()
----> 1 x[-4]
IndexError: list index outofrangeIn [10]:
Notice that negative indices up to the size of the sequence are valid and return the "opposite end" of the sequence. So, x[3]
raises an error, while x[-1]
just returns the other end.
Hopefully that's a bit clearer.
Post a Comment for "Changing Something From Iterating Over A Numpy Array To Vectorization"